Implications for WCF RIA Services

Sep 6, 2011 at 6:31 PM

I had missed this demo until now. I like the design, but I can't help thinking that there is a lot of code overlap between WCF RIA Services itself and the OccasionallyConnected code. As a code demonstration this is a good thing, but in practice, if an application is being built using WCF RIA Services, then I think OccasionallyConnected could be rewritten as a reusable code library instead of something that has to be custom written for every application.

I am interested in other people's thoughts on the matter. Is a version of this demo that is specially designed for WCF RIA Services something that people would be interested in?

Sep 8, 2011 at 9:42 AM

I would absolutely be interested in a WCF RIA Services specific version of this demo.

Remco

Coordinator
Oct 3, 2011 at 7:09 PM

Hi guys,

The offline demo we assembled for Tech Ed wasn't able to utilize RIA Services as much as we'd have liked due to the lack of offline support in RIA Services. They've always wanted to support this scenario; however, the current product just doesn't go far enough to persist data locally into an indexable store, nor to track changes. The other problem we wanted to solve is when to get the information, and where from. Some data (messages) can be retrieved locally, even when online. This means a faster response, and less overhead on the server for that list of states that doesn't change.

Other data requests, such as the list of stops to make, should be made online if connected, but leverage the local store when not. For each online request, we save the last result.

We've encapsulated the data to be sent to the server in a way that is abstracted from RIA Services, WCF, REST, and JSON.

We agree we'd like to take this further, and are still hopeful we can polish this a bit more in the future.

Thanks,

Steve

Coordinator
Oct 3, 2011 at 7:11 PM

I looked into doing things like serializing the EntityContainer, but there were two main reasons why I went the direction I did:

  1. Performance
    1. I was expecting the architecture to support a 500MB+ cache with 25k+ entities
    2. Serializing the container would mean a large amount of wait time whenever you access your items
    3. If you kept it in memory, your application would have a huge memory footprint when you really only want to use a few entities
    4. I wanted to query the serialized objects without having to maintain my own 'meta index'
  2. Architecture
    1. The idea with the messaging architecture is that each call is 100% independent, so the calls have no effect on each other
    2. I wanted to support a bunch of messaging/transport technologies (which I admit might be a 'non-goal' for others)

That being said, there is no reason you can't put an EntityContainer into the cache and make your own messages around it.  I still hope to one day break the demo out from the 'offline framework' =)

Oct 3, 2011 at 9:07 PM

After spending more time looking at the code I realized that I misunderstood it the first time around. Namely, I missed that some of the objects were coming from RIA Services and some of the objects were coming from a separate service. I knew you were trying to show multiple types, but I thought it was either/or, not both at the same time. I thought you were having to manually write a POCO object for every RIA Services object that was going to be offline, which was my main objection.

I do have two remaining objections to the code as is:

1) The RIA Services objects are not being serialized correctly to isolated storage. If you change one of the entities while offline, then when you go back online you will find that the original values and current values of the entity are the same: the original values were lost, which means the change tracking built into RIA Services is being lost. Depending on the client-side validation rules and concurrency settings, that would be a very bad thing. This could easily be solved by integrating RIA Services Contrib into the solution to provide serialization/deserialization of entities. I gave being able to serialize the entire EntityContainer at once just as an example of what could be done there.
2) Every RIA Services Entity object already has a GetIdentity method that returns the primary key of that entity. If the client store were modified to use that key instead of an artificial GUID, then I think that would improve performance, as it would make it easier to load related entities from offline storage.
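Something like this hypothetical helper could expose the key for the store. Note that GetIdentity may not be publicly accessible on Entity in all RIA Services builds, so this sketch falls back to reflection; the EntityKeyHelper name and shape are illustrative, not part of the demo:

```csharp
using System;
using System.Reflection;

// Hypothetical helper, not part of the demo: resolves an entity's real
// primary key so the client store can use it instead of a synthetic GUID.
// GetIdentity() may not be publicly accessible on Entity, so this sketch
// reaches it via reflection.
public static class EntityKeyHelper
{
    public static object GetKey(object entity)
    {
        if (entity == null) throw new ArgumentNullException("entity");

        var method = entity.GetType().GetMethod(
            "GetIdentity",
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        // Fall back to null if the type has no GetIdentity method at all.
        return method != null ? method.Invoke(entity, null) : null;
    }
}
```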

Oct 3, 2011 at 9:24 PM

I just spent some time looking at Sterling. I can provide a deserialization module for Sterling that will know how to serialize/deserialize entities correctly; however, the act of deserializing an entity would add it to the DomainContext, and I am not sure whether that would work for you.

The other solution is to serialize/deserialize the EntityStateSet (or your own equivalent) instead of the Entity itself.

Coordinator
Oct 4, 2011 at 11:07 PM
ColinBlair wrote:

1) objects are not being serialized correctly to isolated storage. If you change one of the entities while offline, then when you go back online you will find that the original values and current values of the entity are the same: the original values were lost, which means the change tracking built into RIA Services is being lost.

2) Every RIA Services Entity object already has a GetIdentity method that returns the primary key of that entity. If the client store were modified to use that key instead of an artificial GUID, then I think that would improve performance, as it would make it easier to load related entities from offline storage.

1) I agree 100% with you. Using the version of RIA Services out at that time, I did not know of a great way to serialize/deserialize RIA entities without losing that information. Take that as a downside of building this to be useful as a demo (MIX 2011) AND as a general guideline at the same time.

2) I also agree with you.  It would take a small amount of code to make that happen and I think that would be a great idea to add into the framework if I ever revisit it (which I want to if I ever get more time). 

I guess that is my way to say, "yup, great feedback!" =)

Nov 3, 2011 at 10:46 AM
Edited Nov 3, 2011 at 3:46 PM

All,

Until now, I hadn't had a chance to take a more in-depth look at this demo since I first found it. I would like to share a few of my observations. Please correct me if I've misunderstood anything.

I compared this demo with DevForce. I have not used DevForce, but I have been reading up on it since yesterday. DevForce claims to address the same problem space as RIA Services and to share a common, high-level approach with it. It also claims to support offline scenarios out of the box, and that you can use the same query both in online mode against the server and in offline mode against the cache. This sounds like a great advantage over RIA Services.

My observations are that both RIA Services and DevForce have a similar cache. In RIA Services the cache is the EntityContainer held in the DomainContext (not to be confused with another EntityContainer class defined in the SL Offline App Demo which is used to store entities in the Sterling database). In DevForce the cache is the EntityCache held in the EntityManager. With RIA Services you cannot use the same query that loads entities from the server into the cache when online (this query is defined on the DomainService, not the DomainContext) to also query the cache when offline. However, both caches are purely in-memory caches. And although both caches can be persisted by serializing to isolated storage, to be able to query them they will have to be fully loaded into memory again.

So, to meet the performance goals outlined in Jason's post above, we can choose to use the Sterling db, abstracted behind an IObjectStore interface. DevForce does hint here that you may choose to use a local database for large datasets: http://drc.ideablade.com/xwiki/bin/view/Documentation/prepare-offline, but I have not found any further guidance or examples. I suspect you will not be able to use the same online-mode query to also query the object store, so DevForce no longer has an advantage over RIA Services here. I suspect you may also end up with a message-based design with DevForce.
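To make that concrete, here is a minimal sketch of what such an IObjectStore abstraction might look like, with a trivial in-memory implementation alongside it; the member names are my assumptions, not the demo's actual interface. A Sterling-backed implementation would persist to isolated storage instead.

```csharp
using System.Collections.Generic;

// Hypothetical IObjectStore contract (member names are assumptions): the
// ViewModel/sync code programs against this interface, and a Sterling-backed
// implementation persists entities to isolated storage.
public interface IObjectStore
{
    void Save<TKey, T>(TKey key, T entity) where T : class;
    T TryLoad<TKey, T>(TKey key) where T : class;
}

// Trivial in-memory stand-in, handy for unit tests.
public class InMemoryObjectStore : IObjectStore
{
    private readonly Dictionary<object, object> _items =
        new Dictionary<object, object>();

    public void Save<TKey, T>(TKey key, T entity) where T : class
    {
        _items[key] = entity;
    }

    public T TryLoad<TKey, T>(TKey key) where T : class
    {
        object value;
        return _items.TryGetValue(key, out value) ? (T)value : null;
    }
}
```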

DevForce has a QueryStrategy/FetchStrategy enum with values like CacheOnly, DataSourceOnly, DataSourceThenCache. Looking at the SL Offline App Demo again, it has a MessageDataSource enum with values like CacheOnly, ServerOnly, etc. They sound similar, but in this case 'cache' means the object store, not the EntityContainer held in the DomainContext. The demo appears to ignore the entities cached in the EntityContainer, as it either gets the entities from the server when online, or from the object store when offline.

Of course, the EntityContainer is used for change tracking, so any entities loaded from the server or the object store must always be added to the EntityContainer. It would be great to see Colin add a module for Sterling db to (de)serialize entities to/from the EntityContainer with proper original and current values, or alternatively we can use the EntityStateSet from WCF RIA Services Contrib, as Colin suggested.

One thing that DevForce does seem to do well is allow entities to be created offline using temporary ids and then fix up the real ids when online.

About the message-based design: with the Silverlight 5 RC now having System.Threading.Tasks, I would like to see Tasks used for all async operations instead of the ThreadPool. The Message class does seem to have a lot in common with Tasks, but of course Message needs to be serializable, where Task is not. I'm not sure how much of the Message behaviour can be replaced with Tasks.
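As a rough idea of the kind of bridge I mean (the callback shape below is a stand-in for the demo's Message class, not its real API), a TaskCompletionSource can wrap any callback-style operation in a Task:

```csharp
using System;
using System.Threading.Tasks;

// Sketch: bridge a callback-style operation (such as a hypothetical
// Message.Send with a completion callback) to a Task using
// TaskCompletionSource, available in Silverlight 5 via System.Threading.Tasks.
public static class MessageTaskBridge
{
    public static Task<TResult> AsTask<TResult>(
        Action<Action<TResult, Exception>> sendWithCallback)
    {
        var tcs = new TaskCompletionSource<TResult>();
        sendWithCallback((result, error) =>
        {
            if (error != null) tcs.TrySetException(error);
            else tcs.TrySetResult(result);
        });
        return tcs.Task;
    }
}

// Usage (assuming some callback-based send method exists):
// Task<Stop[]> stops = MessageTaskBridge.AsTask<Stop[]>(
//     cb => stopsMessage.Send(result => cb(result, null)));
```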

Being able to save changes while offline is very important to me. I can see the SL Offline App Demo can create a message for saving changes while offline that will only be delivered when online. I wonder what will happen if you save changes to an entity while offline, then continue working with the same entity and save further changes again, still offline. You now have two messages to save changes. I suspect the first message will also end up containing the changes made to the entity after the message was created, so that once online again, after the first message is successfully delivered, the second message has no more changes to save (assuming the second message is not sent until the first has completed, but I'm not sure that assumption is correct). Or will it get a concurrency violation exception? Messages in an async world can arrive in a different order, so if each message had its own specific changes, they would have to be delivered in the same order they were created.

Regarding synchronising with messages: since these are tasks kicked off from a different thread, not from the ViewModel, I suppose the ViewModel cannot provide continuations to these tasks (or actually, why not?), but can only subscribe to an event, as is done in the SL Offline App demo (and DevForce as well, I think). I don't like the big switch statements in this event handler, though. I would prefer to see something more strongly typed, with a separate handler for each type of entity and type of change. Perhaps Reactive Extensions can help out here, since we'll be dealing with a stream of raised events. I was also wondering whether the sync thread could not directly update the EntityContainer in the DomainContext, with the ViewModels simply observing the changes in the EntityContainer using one of the many collection binding options provided by RIA Services (see http://blogs.msdn.com/b/kylemc/archive/2010/12/02/collection-binding-options-in-wcf-ria-services-sp1.aspx), abstracting the DomainContext / EntityContainer behind an interface as in http://code.msdn.microsoft.com/silverlight/Task-based-MVVM-Sample-for-2eb86fab (not necessarily wrapping it in a service agent). I also have to take Colin's warnings into account (see http://forums.silverlight.net/t/217498.aspx/1/10?Re+MVVM+and+WCF+RIA+Services+abstracting+RIA+Service+details+) about programming against an ObservableCollection in the ViewModel when at runtime you are totally relying on the behavior of an EntityList deriving from ObservableCollection.

many thanks

Remco

 

Nov 4, 2011 at 1:23 AM

Hi Remco (and others) -

I try not to intrude on this space with my product and my bias. In this case, I am speaking up because you investigated DevForce, opined about it, and asked for my input. I want to remain respectful of this space, so I may redirect certain detailed questions of narrow interest toward my company's forum (http://www.ideablade.com/forum/default.asp).

DevForce clients can use the same query object to query entities on a remote server, in cache, and in a local SQL database. I mean that literally. Compose the query in LINQ and aim it where you want it.

Yes, we have an entity cache analogous to RIA's. We have a built-in mechanism for serializing/deserializing all or selected entities (and entity graphs). Serialized entities **preserve their original and changed states** so that you do not lose information when rehydrating them. It preserves temporary-id information so you can cope with entities whose keys map to identity columns.

As you observed, one does not query against serialized data. Instead, you load an EntityManager from the serialized stream and query against that. Feel free to create as many EntityManagers as you like; none of them have to be online.

I can speak to the performance question regarding caches of entities serialized to local file. The load times are a function of file access speed and data volume. In general, a few megs of entities load quickly (much faster than going over the wire); in-memory cache query speeds are obviously fast.

I doubt this is the right approach for maintaining 500MB of offline data. Some scenarios might still make sense. For example, you might maintain smaller pockets of data in separate files (e.g., Customer A data in one, Customer B data in another) and load just what you need on demand. But, unless you're lucky, you should be thinking about a local database. The scenario I'm most familiar with is the offline ProductCatalog. The app enables offline shopping against the catalog (which could be huge). New orders are held as serialized cache files. The ProductCatalog could be maintained by replication.

DevForce is good to go here when you're building full .NET clients (DevForce is not limited to SL). I'm not sure that we have a code sample for querying both a remote and local SQL database in the same app but we have customers doing just that.

I don't think there is anything comparable that fits the bill in Silverlight. I rather doubt DevForce would work out of the box with a Sterling db. DevForce local-store support is intended for clients running full .NET, EF, and a flavor of SQL database; it isn't data-store agnostic. There is no LINQ provider for "IObjectStore" to my knowledge. You'd probably have to roll your own data access layer and pour the results into entities ... an assignment that I suspect would take about the same effort for RIA or DF.

[Aside: we'd be tempted to take on this assignment if we detected significant customer interest. But in the last ten years I can count on one hand the number of customers who actually needed a local database for offline scenarios. That explains why we haven't produced a sample.]

Message-based architectures won't help you if you need access to 500MB of offline data.

Messages are great when you have small payloads of mostly homogeneous data that you can represent properly. Of course that's an architecture radically different from the entity-oriented architecture of RIA and DevForce. I don't know how you'd shoe-horn that into your RIA (or DevForce) app which encourages you to think in terms of arbitrarily large and amorphous sets of changed entities ("change-sets"). In RIA/DevForce you pretty much just say "SaveChanges" for everything. If you go with messaging (CQRS), you're heading for lots of separate message types, each with a distinct payload (SaveNewClaim, AppendPhotoToClaim, ReviseClaimantAddress, etc.)

In an entity-oriented world, stacking a sequence of saves that potentially involve the same entities can be tricky. One approach we've used with modest success is to serialize "transactions" (each a meaningful change-set or edit session) as separate entity cache files. Later you replay them in succession (load-cache-from-file and SaveChanges) when the application goes back on line. This is kind of a homebrew messaging approach in which each saved change-set is a generalized "message". And YES you would have to manage the concurrency properties (if you had them) to push through successive changes to the same entities. We have some sauce for that.

No matter what you do, you have to cope with the user's expectation that each save would have succeeded if you were online. If a "message" fails during replay, your biggest challenges surround communicating with the user (who probably forgot all about it), resolving conflicts, and deciding if you can process the next message in the queue.

That's why I caution people against getting too carried away in their offline scenarios. I look for natural seams in the workflow (e.g., editing a Customer claim). The user experiences this as a sandboxed edit session. You let the user rework the entities in the sandbox as far as your business logic allows while offline. There are no save sequences within a sandbox session. You just accumulate changes. Each sandbox has its own EntityManager (its own cache) and you serialize it to local storage until the connection is reestablished; then you rehydrate the sandbox and call "SaveChanges". A restored sandbox succeeds or fails on its own. If it suffers, you can present the sandbox in a manner that is familiar to the user; in effect, you recreate the same experience they had when they prepared the save(s) ... only this time they get to work out any conflicts with the server in real time.

--
You mentioned something about passing continuations to async operations, and you doubted DF would facilitate that. We do. Events are only one option for resuming when an async operation finishes. You can pass continuations as callbacks ... and I usually go the callback route anyway. I thought RIA did this too. As for managing sequences of async calls, DevForce ships with a Coroutine component that will hold the fort until the await keyword arrives.
--
So much for "a few remarks".

Coordinator
Nov 4, 2011 at 8:21 AM

You know, I started reading Remco's post and was thinking that I should send an email to Ward Bell about it and BOOM, there he is =)  Like magic.  Feel free to continue this discussion here, I have no issue with DevForce talk here as long as it is on topic for 'offline connectivity for Silverlight/WP'.

 

As for Remco's post:

I wonder what will happen if you save changes to an entity while offline, then continue working with the same entity and save further changes again, while still offline. So you now have two messages to save changes.

My demo here only keeps one version of the entity, and it keeps it in the cache until it hits the wire. So the first save will update the cache and create a message that 'looks' like "save guid blah blah"; then, when you open that item (still offline), it will open the version with pending changes. When you save that entity, it will update the cache again and create another message that 'looks' like "save guid blah blah", which is a functional copy of an existing, pending message and gets ignored. Therefore, there is only one version of the entity at a time AND it will only save once.
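The dedup behaviour described above could look roughly like this (the types and names below are illustrative, not the demo's actual code): a new save message is dropped when a functional copy is already pending, because the cache already holds the latest version of the entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative types (not the demo's actual classes) showing the
// "functional copy gets ignored" rule for pending save messages.
public class SaveMessage
{
    public Guid EntityKey { get; set; }
    public string Operation { get; set; }
}

public class PendingMessageQueue
{
    private readonly List<SaveMessage> _pending = new List<SaveMessage>();

    // Returns true if the message was queued, false if an equivalent
    // message is already pending (the cache holds the latest entity
    // version, so one pending save message is enough).
    public bool Enqueue(SaveMessage message)
    {
        bool duplicate = _pending.Any(m =>
            m.EntityKey == message.EntityKey &&
            m.Operation == message.Operation);
        if (!duplicate)
        {
            _pending.Add(message);
        }
        return !duplicate;
    }
}
```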

As for Ward's post:

Ward is 100% right in his (unstated) point that things get crazy in this space really fast. I completely gloss over the scenario where, while offline, you edit an entity, edit it again, edit it again, and now want to see what changed between edit #1 and #2. Or what if you want to undo edit #3? I know some people (Rocky of CSLA fame, for example) have put a crazy amount of thought and work into solving that exact problem (and his CSLA solution would work with this demo).

As to the 500MB-scale thoughts: I completely agree that if your scenario is to start up the app, load 500MB of data into it from the cache, and display it, this will not solve the problem. But if you are working with 500MB of customer data, you can keep a 'lookup' entity with all the customer names (for list boxes and the like) and, because Sterling uses an index, you will only have to load the one customer they selected.

I also like the idea of not using replication systems to keep your larger objects 'current', but instead displaying the stale information (with a notification that a newer version is downloading now) and updating it in the background when it arrives. Of course, there are MANY systems where this functionality would be a really bad idea, such as medical records.

Jason R. Shaver, Microsoft Silverlight PM

Nov 4, 2011 at 12:41 PM
Edited Nov 4, 2011 at 12:54 PM

Jason, thanks for allowing the comparison with DevForce on your forum. I did not mean any disrespect either. I think the discussion started on the WCF RIA Services forum, but since we found such a great demo of a SL Offline App here, the discussion seems to have moved here, and that makes sense. It's great to have your, Colin's, and Ward's input helping me (and others) assess the best choice of technology / design / architecture.

Ward, thank you for your feedback as well. I take in all your points. With regard to significant customer interest in local databases in rich internet apps as opposed to full .NET apps: I suppose with Silverlight we cannot look back 10 years. I'm not sure how long Silverlight has had isolated storage, but even HTML5 will be getting isolated storage. Perhaps customers may only just be waking up (or at least me, my client, and its customers are) to the fact that offline capabilities are now possible in rich internet apps, and realising there may be a need for local databases in them. That said, I suppose if there were an MS SQL Server flavour that worked in SL isolated storage, you would also need the Entity Framework to run in SL. Right now Sterling db is one of the few options available, but yes, I agree: introducing (or shoe-horning, as you said) a message-based architecture appears to undo the productivity benefits that the entity-based architecture of RIA / DevForce gives. The message-based architecture does not feel like a simple online/offline switch over the entity-based architecture. I wish I could add something more constructive here.

Once again, thank you all very much, I have a lot to think about.

Nov 4, 2011 at 10:10 PM

I think Jeremy's Sterling is a great option. If this kind of thing took off, there might be a LINQ provider for it. At one point there were murmurings about MS providing a local SQL engine for SL, and that kind of froze things (they do that ... not intentionally, but it's a killer). Anyway, if Sterling or something like it got traction, we could potentially write a provider and bypass the EF intermediary (which won't exist in SL, obviously). I don't see that happening soon ... unless someone pays us to do it :).

As for HTML5, yes ... they get something akin to iso-store. How long before there is a Sterling for that? Actually, I think the trend is more toward NoSQL dbs in that space ... which brings you back to serializing entities, as far as I can tell. I can foresee writing cached entity graphs as indexed documents. Could do that with Sterling today, I imagine. This may be where Jason is going in his comments. Or maybe that's me reading into his comments. Making my head hurt.

Nov 5, 2011 at 1:15 AM

It's pretty embarrassing that I didn't realize Sterling IS a NoSQL db with LINQ support built in. I've known about Sterling for more than a year and have been meaning to take it out for a spin.

Nov 7, 2011 at 12:48 PM

@wardbell, you were right before (I see people don't realize that): Sterling has no LINQ provider. It "supports LINQ to Objects", but this is nothing more than LINQ over in-memory collections. Sterling is a serializer/deserializer with in-memory indexes, that's all; there is no query engine. Everything is in memory, and this is very important because, as you can imagine, you cannot rely on too much data being stored offline.

Nov 7, 2011 at 8:06 PM

Oh. Hmm. I'd want to look at it more closely before commenting. If it only worked by pulling everything into memory and behaving as an in-memory database ... well, that doesn't address the large-data problem and doesn't provide the hoped-for management of numerous serialized caches (or cache fragments, or entity graphs, or whatever you want to call small bundles of entities that collectively comprise a semantically useful data set). I'm withholding judgement until I know more.

Coordinator
Nov 7, 2011 at 8:23 PM
cristoph1 wrote:

there is no query engine, everything is in memory; this is very important because, as you can imagine, you cannot rely on too much data being stored offline.

The objects are stored offline; the only part that is in memory is the index (which is also stored offline), so you can query (via LINQ) on the properties you promote to the index (in the demo, I just use the Entity's ID). This means that if you extend the demo to use intelligent properties for storing your entities, you can get VERY fast querying, and everything is stored in isolated storage. What is also nice about Sterling is that when you query, it does not have to deserialize/load the entities, so your memory footprint is just the index and the objects you need.
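For readers unfamiliar with Sterling, a table definition with a promoted index looks roughly like this. This is based on the Sterling 1.x API from memory, so the namespace and exact method signatures may differ:

```csharp
using System;
using System.Collections.Generic;
using Wintellect.Sterling.Database; // Sterling 1.x namespace (from memory)

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    // ...large payload properties stay on disk until the entity is loaded
}

public class OfflineDatabase : BaseDatabaseInstance
{
    protected override List<ITableDefinition> RegisterTables()
    {
        return new List<ITableDefinition>
        {
            // Only the key and the promoted Name index live in memory;
            // the serialized entities stay in isolated storage.
            CreateTableDefinition<Customer, int>(c => c.Id)
                .WithIndex<Customer, string, int>("Customer_Name", c => c.Name)
        };
    }
}

// Querying by index deserializes only the matching entities, e.g.:
// var smiths = database.Query<Customer, string, int>("Customer_Name")
//                      .Where(i => i.Index == "Smith")
//                      .Select(i => i.LazyValue.Value)
//                      .ToList();
```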

The demo makes it pretty easy to test this.  You can run the demo, then browse to the isolated storage directory and see all the entities as files.  

Jason R. Shaver

Nov 7, 2011 at 8:55 PM

Hi everyone ... just heard about the conversation and was going to weigh in, but it looks like everyone has this in hand. Sterling has an in-memory model, which is the default, but it works on the concept of drivers and has drivers for isolated storage on both the Windows Phone and the desktop flavor of Silverlight (it also has a .NET provider). It's ironic that I came across this demo now, because I'm writing an example for my upcoming Line of Business book that also tackles occasionally connected scenarios. It uses a server version of Sterling with a client version of Sterling, and a custom domain context with WCF RIA Services to link the two. All changes are stored as event sources in Sterling, then replayed when the client reconnects to synchronize with the master.

As for the comments about in-memory: as Jason mentioned, Sterling optimizes for cache and query. It holds only the keys and indexes that you specify in memory and keeps everything else on disk. The idea here is to have lightning-fast in-memory queries, then deserialize only the data that is "interesting." The caveat is that you can reach scenarios that push it to the limit, but even integer keys on 100,000-record sets are still less than half a megabyte of footprint.

The two biggest flaws I am aware of right now with Sterling are (1) versioning - it uses reflection as the dictionary/key for serializing, so if the type changes it breaks the serialized data. This will be addressed in a future version, but currently users have to manually handle migrations when types that are being serialized change; and (2) file storage - right now Sterling uses folders and files for records, which increases the disk footprint and slows access. In the future we're looking at a single-file model where a table is expressed as a file with markers/pointers, so the I/Os traverse a single handle rather than having to keep opening/closing multiple files. The obvious issue there is that the files can quickly become fragmented, so I've been stuck on figuring out the best way to clean this up in the background without making the API overly complex or locking any threads.

If you have any other questions/concerns about Sterling, please don't hesitate to contact me through the site here or on Twitter @JeremyLikness.

I'll have to check out this application when I get a chance too and see how the approach compares to what I've been writing. Thanks!

Nov 7, 2011 at 9:28 PM

Hi Jeremy - Curious about the "Event Sourcing" angle, as that paradigm is distant from the entity-change-set paradigm implied in the SaveChanges call. So what are the "events"? And how do you cope with server rejections during playback? The versioning issue is the biggest headache for our customers. I don't see an easy solution but haven't thought about it too hard. Do you feel you are close?

Nov 7, 2011 at 11:05 PM
Edited Nov 7, 2011 at 11:05 PM
jeremylikness wrote:

 It holds only keys and indexes that you specify in memory and keeps everything else on disk. The idea here is to have lightning fast in-memory queries then deserialize only the data that is "interesting."

Jeremy,

The problem is that you can only query the database efficiently by index/key, which I found is very important. Again, there is no LINQ provider; I just wanted to point out that if you want to query a Sterling db by any field other than a key/index, it will load ALL objects (of that type) from the database into memory and then do LINQ to Objects, won't it? So in that case everything is in memory: indexes + objects. And in a business application I don't think you only need to query a database by PK/index.

Nov 9, 2011 at 7:46 PM

See here what I'm talking about:

http://www.maxpaulousky.com/blog/archive/2011/07/27/windows-phone-mango-db-engines-performance-testing.aspx

Searching by a non-indexed column in a Sterling DB took 22 seconds for a total of 100 objects (with dependencies), because everything is loaded into memory first and then LINQ to Objects is applied.