Friday, April 29, 2011

How to share objects across processes in .NET?

I'm trying to "share" .NET objects across separate processes. One type of process is a web service that manipulates a set of domain entities. Another is a Windows service that does automatic batch processing on the same set of objects.

Beyond the typical solution of having the DB as the shared space where both types of processes read/write the objects, what might be a better, more distributed architecture for having these different processes see and work on the same objects?

I've considered using a distributed cache as a shared store for objects, but that doesn't fully support objects and their relationships. Object graphs inserted into a distributed cache are flattened and objects end up stored in multiple disconnected copies.

Is a "messaging bus" the right way to go about, letting the processes send each other updated copies of the objects?

Or are there other solutions altogether to consider?

From StackOverflow
  • I would suggest using WCF services. For local use (calling from the Windows service) you can use netNamedPipeBinding. That also leaves you free to physically separate the two processes in the future, e.g. by switching to a TCP binding.

    Noldorin : Yeah, WCF with named pipes is the way to go with this.
    urig : I know I can use WCF to shuffle objects back and forth between the processes, but that's just the plumbing. My question is - what's the strategy to use? Does one process own all objects or are they distributed between processes? When a process changes an object, how do the other processes become aware of this? Do they pull the update or is it pushed? Etc.
    Shrike : By "objects" do you mean domain objects, like Customer, Product, etc.? If so, I'd suggest using DTOs (data transfer objects) to transfer their data. Each process would have its own copies of the objects. Then you could use an optimistic locking strategy (with a special version field in each object) to tackle the concurrency issue (see the sketch just below).
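
    A minimal sketch of the DTO-plus-optimistic-locking idea Shrike describes, in C#. The CustomerDto type, its fields, and the in-memory store are hypothetical, invented here just to illustrate the version-check pattern:

        using System;
        using System.Collections.Generic;
        using System.Runtime.Serialization;

        // Hypothetical DTO: each copy carries a Version stamp so a stale
        // write-back can be detected.
        [DataContract]
        public class CustomerDto
        {
            [DataMember] public Guid Id { get; set; }
            [DataMember] public string Name { get; set; }
            [DataMember] public int Version { get; set; }
        }

        public class CustomerStore
        {
            private readonly Dictionary<Guid, CustomerDto> _store =
                new Dictionary<Guid, CustomerDto>();
            private readonly object _sync = new object();

            public CustomerDto Get(Guid id)
            {
                lock (_sync)
                {
                    CustomerDto current;
                    return _store.TryGetValue(id, out current) ? current : null;
                }
            }

            // Optimistic concurrency: the write succeeds only if the caller's
            // copy is based on the version currently in the store.
            public bool TryUpdate(CustomerDto updated)
            {
                lock (_sync)
                {
                    CustomerDto current;
                    if (_store.TryGetValue(updated.Id, out current) &&
                        current.Version != updated.Version)
                    {
                        return false; // stale copy: caller must re-read and merge
                    }
                    updated.Version++;
                    _store[updated.Id] = updated;
                    return true;
                }
            }
        }

    A false return from TryUpdate is the optimistic-locking signal: the caller re-reads the current copy, reapplies its change, and tries again.
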
  • You're going to need to decide which of your two services, or possibly a third, as-yet-unwritten service, is the authoritative source for your domain objects.

    A remoting- or WCF-based set of services exposed by the authoritative source should provide the central location for your object graphs (a minimal hosting sketch follows this answer). Be aware, though, that at that point you are essentially building a distributed cache for your objects yourself.

    Have you considered the Velocity Project?

    urig : Indeed I did consider using (rather than implementing) a distributed cache to hold my objects and share them. But these products are limited in that they do not support references between the cached objects. Here's a related StackOverflow question that I posted a while ago: http://stackoverflow.com/questions/701656/real-object-references-in-distributed-cache
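
    For concreteness, here is a minimal sketch, reusing the hypothetical CustomerDto and CustomerStore from the earlier sketch, of how the authoritative process might expose its objects over netNamedPipeBinding as the first answer suggests. The contract, address, and service names are all invented for illustration:

        using System;
        using System.ServiceModel;

        [ServiceContract]
        public interface IEntityService
        {
            [OperationContract]
            CustomerDto GetCustomer(Guid id);

            [OperationContract]
            bool TryUpdateCustomer(CustomerDto customer);
        }

        // A single service instance owns the objects; the web service and the
        // Windows service both call it, so there is one authoritative copy.
        [ServiceBehavior(InstanceContextMode = InstanceContextMode.Single)]
        public class EntityService : IEntityService
        {
            private readonly CustomerStore _store = new CustomerStore();

            public CustomerDto GetCustomer(Guid id) { return _store.Get(id); }

            public bool TryUpdateCustomer(CustomerDto customer)
            {
                return _store.TryUpdate(customer);
            }
        }

        class Program
        {
            static void Main()
            {
                // Host inside the authoritative process.
                using (var host = new ServiceHost(typeof(EntityService),
                           new Uri("net.pipe://localhost/entities")))
                {
                    host.AddServiceEndpoint(typeof(IEntityService),
                        new NetNamedPipeBinding(), "");
                    host.Open();
                    Console.WriteLine("Entity service running; press Enter to stop.");
                    Console.ReadLine();
                }
            }
        }

    The other process would connect with new ChannelFactory<IEntityService>(new NetNamedPipeBinding(), new EndpointAddress("net.pipe://localhost/entities")).CreateChannel(); swapping the binding for TCP later would let the two processes move to separate machines.
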
  • In our projects, we've used ScaleOut StateServer (a commercial product - www.scaleoutsoftware.com) to accomplish distributed caching / replication of objects throughout a server farm for similar purposes. This has been quite effective, although using objects does incur (de)serialization costs, so in many cases we simplify what we're storing to just string values where possible.

    We haven't fully evaluated the Velocity Project, since our usage started before that existed and we don't have time or compelling reasons to consider a switch at this point, but that obviously warrants some investigation if you're just starting now.

    Edit: I did indeed miss the important part of the question - the flattening of object references. This may be over-complicating things or have other drawbacks, but what if you more closely simulated database storage in your distributed cache: keep a single copy of each distinct entity, and use loose, ID-based references to link those entities together?

    Example: you have a class 'Group' with a 'Leader' property and a 'Members' collection, both holding instances of your 'Person' class. You'd have to use custom serialization to pull it off, and nothing will magically solve concurrency / dirty-update problems, but the idea is that what you'd put into the distributed cache would actually be all the individual 'Person' instances, plus a shallow copy of the 'Group' instance itself. That shallow copy would serialize the normal 'Group' properties (name, etc.), along with a unique identifier for each 'Person' reference it contains (the original database ID, a GUID, a unique username, whatever is appropriate) rather than the Person objects themselves. So you'd have a 'LeaderID' instead of the Leader, and the Members collection would serialize as a list of MemberIDs (sketched in code after this thread). Each Person referenced is also stored as a separate object; that's where the concurrency trickiness comes into play.

    When deserializing (or on access, depending on usage patterns), the Group shallow copy would follow all Person ID references and re-hydrate those references with the real Person objects stored separately in the distributed cache. You'd need to implement locking mechanisms to make sure updates to those objects, which could be shared among many different Groups, were safe. You'd also need a versioning mechanism and a 'dirty check' whenever necessary to re-read/pick up any changes to the Person object made within the distributed cache.

    It does seem quite complicated, but that's the most generic approach I could think of without knowing the specifics of your use case.

    urig : I've actually been using the same product - SOSS - but just like all other distributed caches, it "flattens" object graphs. See my StackOverflow question here: http://stackoverflow.com/questions/701656/real-object-references-in-distributed-cache
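
    To make the shallow-copy idea concrete, here is a rough sketch. The Group and Person types are the hypothetical ones from the answer, and ICache is a stand-in for whatever distributed cache API (SOSS, Velocity, ...) is actually in use:

        using System;
        using System.Collections.Generic;
        using System.Linq;

        [Serializable]
        public class Person
        {
            public Guid Id { get; set; }
            public string Name { get; set; }
            public int Version { get; set; }  // for dirty checks on shared Persons
        }

        // What actually goes into the cache for a Group: its normal properties
        // plus ID references instead of Person objects, so each Person is
        // stored exactly once.
        [Serializable]
        public class GroupShallowCopy
        {
            public Guid Id { get; set; }
            public string Name { get; set; }
            public Guid LeaderId { get; set; }
            public List<Guid> MemberIds { get; set; }
        }

        // The fully re-hydrated view used by application code.
        public class Group
        {
            public Guid Id { get; set; }
            public string Name { get; set; }
            public Person Leader { get; set; }
            public List<Person> Members { get; set; }
        }

        // Stand-in for the distributed cache's put/get API.
        public interface ICache
        {
            void Put(string key, object value);
            object Get(string key);
        }

        public static class GroupCacheMapper
        {
            // Flatten: store each Person individually, then a shallow Group
            // that refers to them only by ID.
            public static void Store(ICache cache, Group group)
            {
                cache.Put("person:" + group.Leader.Id, group.Leader);
                foreach (var member in group.Members)
                    cache.Put("person:" + member.Id, member);

                cache.Put("group:" + group.Id, new GroupShallowCopy
                {
                    Id = group.Id,
                    Name = group.Name,
                    LeaderId = group.Leader.Id,
                    MemberIds = group.Members.Select(m => m.Id).ToList()
                });
            }

            // Re-hydrate: follow the ID references back to the single stored
            // copy of each Person.
            public static Group Load(ICache cache, Guid groupId)
            {
                var shallow = (GroupShallowCopy)cache.Get("group:" + groupId);
                if (shallow == null) return null;

                return new Group
                {
                    Id = shallow.Id,
                    Name = shallow.Name,
                    Leader = (Person)cache.Get("person:" + shallow.LeaderId),
                    Members = shallow.MemberIds
                        .Select(id => (Person)cache.Get("person:" + id))
                        .ToList()
                };
            }
        }

    Because two Groups that share a Person now re-hydrate the same cached copy, the locking and version/dirty-check mechanisms the answer describes still have to sit on top of this.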
