Distributed Systems
Remote Invocation
- request-reply communication: most primitive; minor improvement over underlying IPC primitives
- 2-way exchange of messages as in client-server computing
- RPC, RMI: mechanisms enabling a client to invoke a procedure/method from the server via communication between client and server
- Remote Procedure Call (RPC): extension of conventional procedural programming model
- allow client programs to transparently call procedures in server programs running in separate processes, and in separate machines from the client
- i.e. to the caller, the procedures appear to be in local address space
- RPC system hides encoding/decoding of parameters and results, message passing, and preserves required invocation semantics
- Remote Method Invocation (RMI): extension of conventional object oriented programming model
- allows objects in different processes to communicate i.e. an object in one JVM is able to invoke methods in an object in another JVM
- extension of local method invocation: allows object in one process to invoke methods of an object living in another process
Pros and Cons of RPC, RMI
- pros:
- high transparency: caller works with remote procedure/object as if it were local
- general purpose, flexible
- cons:
- high overhead for communication, marshalling/unmarshalling: latency
Example Applications
- RPC: machine learning inference on your phone
- phone is too slow
- make RPC to server, which has resources to execute inference quickly
- RMI: checking a user’s password
Request-Reply Protocol
- most common exchange protocol for remote invocation
Operations
- doOperation(): sends a request to the remote object, and returns the reply received
- getRequest(): acquires a client request at the server port
- sendReply(): sends the reply message from server to client
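The three operations can be sketched as follows. This is a minimal illustration, assuming an in-memory pair of queues in place of a real network; the snake_case names, the `upper` operation and the `serve_one` helper are hypothetical and not part of any particular RPC library.

```python
import queue
import threading

# Hypothetical in-memory "network": one queue per direction.
requests = queue.Queue()
replies = queue.Queue()

def do_operation(operation, argument):
    """Client side: send a request, then block until the reply arrives."""
    requests.put((operation, argument))
    return replies.get(timeout=1.0)

def get_request():
    """Server side: acquire the next client request."""
    return requests.get(timeout=1.0)

def send_reply(result):
    """Server side: send the reply message back to the client."""
    replies.put(result)

def serve_one():
    """Handle exactly one request (assumed helper for this sketch)."""
    operation, argument = get_request()
    if operation == "upper":
        send_reply(argument.upper())
    else:
        send_reply("error: unknown operation " + operation)

server = threading.Thread(target=serve_one)
server.start()
reply = do_operation("upper", "hello")
server.join()
print(reply)
```

The client blocks inside `do_operation` until the server has called `send_reply`, which is the synchronous behaviour the protocol describes.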
Design issues
- timeouts: what to do when a request times out? how many retries?
- duplicate messages: how to discard?
- e.g. recognise successive messages with the same request ID and filter them
- lost replies: dependent on idempotency of server operations
- history: do servers need to send replies without re-execution? then history needs to be maintained
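One possible way to combine duplicate filtering with a reply history is sketched below. It assumes request IDs are unique per client and that a single client is involved; the `Server` class and its `handle` method are illustrative names only.

```python
class Server:
    """Sketch of a server with a duplicate filter and a reply history."""

    def __init__(self):
        self.balance = 0
        self.history = {}  # request_id -> reply already sent

    def handle(self, request_id, amount):
        # Duplicate filter: a repeated request ID means a retransmission.
        if request_id in self.history:
            # Resend the stored reply without re-executing the operation.
            return self.history[request_id]
        self.balance += amount  # non-idempotent operation
        reply = self.balance
        self.history[request_id] = reply
        return reply

server = Server()
first = server.handle(request_id=1, amount=10)
retransmitted = server.handle(request_id=1, amount=10)  # duplicate
second = server.handle(request_id=2, amount=5)
```

Because the deposit is not idempotent, re-executing the duplicate would have produced a wrong balance; the history lets the server answer the retransmission safely.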
Design decisions
- retry policy
- how many times to retry?
- duplicate filter mechanism
- retransmission policy
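A retry policy could look roughly like this. The sketch assumes a hypothetical `LossyChannel` that simulates timeouts by dropping the first few messages; a real client would wait on a socket with a timeout instead.

```python
class LossyChannel:
    """Hypothetical channel that loses the first `drops` messages."""

    def __init__(self, drops):
        self.drops = drops
        self.attempts = 0

    def send_and_wait(self, message):
        self.attempts += 1
        if self.attempts <= self.drops:
            raise TimeoutError("no reply before the timeout")
        return "reply to " + message

def do_operation(channel, message, max_retries=3):
    """Retransmit on timeout; give up after max_retries retransmissions."""
    for _ in range(1 + max_retries):
        try:
            return channel.send_and_wait(message)
        except TimeoutError:
            continue  # retransmit the same request
    raise ConnectionError(
        "no reply after %d attempts" % (1 + max_retries))

ok = do_operation(LossyChannel(drops=2), "ping")

try:
    do_operation(LossyChannel(drops=10), "ping")
    gave_up = False
except ConnectionError:
    gave_up = True
```

When the retry budget is exhausted the client cannot tell whether the server failed or the messages were lost, so it reports the failure to the caller as an exception.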
Exchange protocols
Different flavours of exchange protocols:
- request (R): no value to be returned from remote operation
- client needs no confirmation operation has been executed
- e.g. sensor producing large amounts of data: some loss may be acceptable
- request-reply (RR): useful for most client-server exchanges. Reply regarded as acknowledgement of request
- subsequent request can be considered acknowledgement of the previous reply
- request-reply-acknowledge (RRA): acknowledgement of reply contains request id, allowing server to discard entry from history
TCP vs UDP
- limited length of datagrams may affect transparency of RMI/RPC systems which should be able to accept data of any size
- TCP can be chosen to avoid multipacket protocols, avoiding this issue
- TCP additional overheads: acknowledgements, connection establishment
- TCP also ensures reliable delivery
- no need to filter duplicates or use histories
- TCP therefore simplifies implementation of request-reply protocol
- if application doesn’t require all of TCP facilities, more efficient, tailored protocol can be implemented over UDP
Invocation semantics
- maybe: RPC may be executed once or not at all
- unless the caller receives a result, it is unknown whether the procedure was executed
- at-least-once: either
- remote procedure was executed at least once and caller received a response, or
- caller received an exception indicating that no result was received (the procedure may or may not have been executed)
- at-most-once: either
- remote procedure was executed exactly once and caller received a response, or
- caller received an exception indicating that no result was received, in which case the procedure was executed either once or not at all
- level of transparency provided depends on design choices and objectives
- Java RMI supports at-most-once invocation semantics
- Sun RPC supports at-least-once
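The difference between at-least-once and at-most-once shows up when a reply is lost and the request is retransmitted. The sketch below simulates that case against a non-idempotent counter; the `Counter` class and the `filter_duplicates` flag are assumptions made for illustration.

```python
class Counter:
    """Server object with an optional duplicate filter and history."""

    def __init__(self, filter_duplicates):
        self.filter_duplicates = filter_duplicates
        self.value = 0
        self.history = {}  # request_id -> reply

    def increment(self, request_id):
        if self.filter_duplicates and request_id in self.history:
            return self.history[request_id]  # no re-execution
        self.value += 1
        self.history[request_id] = self.value
        return self.value

def call_with_one_lost_reply(server):
    """The first reply is lost, so the client retransmits the request."""
    server.increment(request_id=1)         # executed, reply lost
    return server.increment(request_id=1)  # retransmission

at_least_once = Counter(filter_duplicates=False)
at_most_once = Counter(filter_duplicates=True)
call_with_one_lost_reply(at_least_once)
call_with_one_lost_reply(at_most_once)
print(at_least_once.value, at_most_once.value)  # prints "2 1"
```

Without the filter the increment runs twice, which is acceptable only for idempotent operations; with the filter and history it runs once.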
Fault tolerance
Transparency
- location and access transparency are usually goals for remote invocation
- sometimes complete transparency undesirable:
- remote invocations are more prone to failure due to network/remote machines
- latency of remote invocations significantly higher than local ones
- many implementations provide access transparency, but not complete location transparency, allowing programmer to optimise based on location
HTTP: RR protocol
- see comp sys notes
RPC
- RPCs enable clients to execute procedures in server processes based on a defined service interface
- generally implemented over request-reply protocol
RPC Roles
- communication module: implements design w.r.t. retransmission of requests, duplicate handling, result retransmission
- client stub procedure: behaves like a local procedure to client
- marshals procedure identifiers and arguments, and passes it to communication module
- unmarshals the results in the reply
- dispatcher: selects server stub based on procedure identifier, forwarding request to the server stub
- server stub procedure: unmarshals arguments in request message, and forwards to service procedure
- marshals return values into the reply message, which is returned to the client
- service procedure: actual procedure to call, implements procedures in the service interface
- client/server stub procedures, as well as dispatcher, can be generated automatically by an interface compiler
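The roles above can be wired together in a single process to show who does what. This is only a sketch: JSON stands in for the marshalling format, a direct function call stands in for the communication module, and the procedure identifier `1` and all function names are hypothetical.

```python
import json

def add(a, b):
    """Service procedure: the actual implementation."""
    return a + b

def add_server_stub(payload):
    """Server stub: unmarshal arguments, call the service, marshal result."""
    args = json.loads(payload)
    return json.dumps(add(*args))

# Dispatcher table: procedure identifier -> server stub.
SERVER_STUBS = {1: add_server_stub}

def dispatcher(request):
    """Select the server stub from the procedure identifier."""
    message = json.loads(request)
    stub = SERVER_STUBS[message["procedure_id"]]
    return stub(message["payload"])

def communication_module(request):
    """Stand-in for the network: a real one would send, retry and filter."""
    return dispatcher(request)

def remote_add(a, b):
    """Client stub: looks like a local procedure to the caller."""
    request = json.dumps(
        {"procedure_id": 1, "payload": json.dumps([a, b])})
    reply = communication_module(request)
    return json.loads(reply)

result = remote_add(2, 3)
```

The caller only ever sees `remote_add(2, 3)`; marshalling, dispatch and message passing stay hidden behind the stub.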
Sun RPC
- designed for client-server communication in NFS
- implementer can choose to make RPCs over UDP or TCP
- broadcast RPC is available
- at-least-once invocation semantics
- interface definition language XDR (external data representation) allows you to define the remote interface in a standard way
- compiler rpcgen compiles the remote interface
- has a runtime library
- interfaces are not named, but referenced by program number (obtained from central authority) and version number
- procedure definition specifies procedure signature and procedure number
- procedure number used as a procedure identifier in requests
- procedures allow only a single input parameter: multiple arguments must be packed into a single structure
- output parameters are returned via a single result
Remote Method Invocation
- RMI is similar to RPC, but extended to distributed objects
- a calling object is able to invoke a method in a remote object
- underlying details are generally hidden from the user
Similarities: RMI and RPC
- support programming with interfaces
- constructed on top of Request-Reply protocols
- can offer range of call semantics
- similar level of transparency
Differences: RMI and RPC
- full expressive power of OOP, methodologies and tools
- all objects in RMI-based system have unique object references: much richer parameter-passing than possible in RPC
Distributed Object Concepts
- object references: used to access objects. Can be assigned to variables, passed as arguments, returned as results
- actions: initiated by an object invoking a method in another object
- modified state of the object
- object state queried
- new object created
- tasks delegated to other objects (chain of invocations)
- garbage collection: process of releasing memory that was used by objects, but is no longer in use
Architectures
- client-server architecture: can be adopted for a distributed object system
- server manages objects
- clients invoke methods using RMI: the request is sent in a message to the server managing the object
- invocation carried out, and result returned to the client
- client/server in different processes enforces encapsulation: unauthorised method invocations are not possible
- replicated architecture: objects can be replicated to improve fault tolerance and performance
Remote Objects
- remote object: an object able to receive and make local/remote invocations
- remote object references: other objects are able to invoke the methods of a remote object if they have access to its remote object reference
- identifier used throughout distributed system to refer to particular, unique remote object
- remote interface: every remote object has a remote interface, specifying which of its methods can be invoked remotely
- class of a remote object implements the methods of the remote interface
- CORBA Interface Definition Language: allows definition of remote interfaces. Clients don’t need to use the same programming language as the remote object in order to invoke its methods remotely
- Java RMI: interfaces become remote interfaces by extending the Remote interface
Distributed Actions
- actions can be performed on remote objects (i.e. in different processes):
- e.g. executing a remote method defined in the remote interface
- e.g. creating a new object in the target process
- actions are invoked using RMI
Garbage Collection
- if underlying language (e.g. Java) supports GC, any associated RMI system should allow GC of remote objects
- Distributed GC: local, existing GC cooperates with additional module that counts references to do distributed GC
Exceptions
- remote invocation may fail
- process may have crashed
- process may be too busy to reply
- result message may have been lost
- need to have all usual exceptions for local invocations
- extra exceptions for remote invocation: e.g. timeouts
- CORBA IDL allows you to specify application-level exceptions
Implementation
- communication module: communicates messages (requests, replies) between client and server
- two cooperating modules implement request-reply protocol
- responsible for implementing invocation semantics
- queries remote reference module to obtain local reference of object, then passes local reference to the dispatcher for the class
- remote reference module: creates remote object references
- maintains remote object table which maps between local and remote object references
- remote object table: has an entry for each
- remote object reference held by the process
- local proxy
- entries get added to the remote object table when
- remote object reference is passed for the first time
- remote object reference is received, and an entry is not present in the table
- servant: objects in process receiving the remote invocation
- instance of a class that provides the body of a remote object
- live within a server process
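The remote object table can be pictured as a two-way mapping. The sketch below assumes a remote object reference is a `(host, port, object number)` tuple, which is a simplification; `RemoteReferenceModule`, `export` and `resolve` are illustrative names.

```python
import itertools

class RemoteReferenceModule:
    """Sketch: maps between local servants and remote object references."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self._next_id = itertools.count(1)
        self._local_by_remote = {}   # remote ref -> servant
        self._remote_by_local = {}   # id(servant) -> remote ref

    def export(self, servant):
        """Return the servant's remote reference, creating it on first use."""
        key = id(servant)
        if key not in self._remote_by_local:
            remote_ref = (self.host, self.port, next(self._next_id))
            self._remote_by_local[key] = remote_ref
            self._local_by_remote[remote_ref] = servant
        return self._remote_by_local[key]

    def resolve(self, remote_ref):
        """Translate an incoming remote reference to the local servant."""
        return self._local_by_remote[remote_ref]

module = RemoteReferenceModule("10.0.0.1", 4000)
servant = object()
ref = module.export(servant)
```

An entry is added only the first time a servant's reference is passed; later calls to `export` return the same reference, and `resolve` is what the communication module would use on an incoming request.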
RMI Software
- software layer between application and the communication and object reference modules, composed of proxies, dispatchers, and skeletons
- proxy: behaves like a local object to the invoker making RMI transparent to clients
- usually an interface
- lives in the client
- instead of executing the invocation, it forwards it in a message to a remote object
- hides details of remote object reference, marshalling, unmarshalling, sending/receiving messages from client
- one proxy per remote object reference the process holds
- dispatcher: translates method ID to the real method
- server has 1 dispatcher + 1 skeleton for each class representing a remote object
- receives requests from the communication module
- uses operationId to select appropriate method in the skeleton
- skeleton: skeleton class of remote object implementing methods of remote interface
- handles marshalling/unmarshalling
- skeleton methods unmarshal arguments in the request and invoke the corresponding method in the servant
- marshals the result in a reply message to the sending proxy’s method
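A rough picture of how proxy, dispatcher and skeleton cooperate for one remote class is given below. JSON is assumed as the marshalling format and a direct call to `dispatch` replaces the communication module; the `Account` class and all names other than `operationId` are hypothetical.

```python
import json

class Account:
    """Servant: provides the body of the remote object."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance

class AccountSkeleton:
    """Unmarshals arguments, invokes the servant, marshals the result."""
    def __init__(self, servant):
        self.servant = servant

    def deposit(self, marshalled_args):
        (amount,) = json.loads(marshalled_args)
        return json.dumps(self.servant.deposit(amount))

class AccountDispatcher:
    """Uses operationId to select the matching skeleton method."""
    def __init__(self, skeleton):
        self.methods = {"deposit": skeleton.deposit}

    def dispatch(self, request):
        message = json.loads(request)
        return self.methods[message["operationId"]](message["args"])

class AccountProxy:
    """Client side: same methods as the servant, but forwards messages."""
    def __init__(self, send):
        self.send = send  # stand-in for the communication module

    def deposit(self, amount):
        request = json.dumps(
            {"operationId": "deposit", "args": json.dumps([amount])})
        return json.loads(self.send(request))

servant = Account()
proxy = AccountProxy(AccountDispatcher(AccountSkeleton(servant)).dispatch)
proxy.deposit(40)
total = proxy.deposit(2)
```

To the client, `proxy.deposit(2)` reads like a local call, while the state change actually happens in the servant.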
Development
- definition of interface for remote objects: defined using supported mechanism of the particular RMI software
- compile interface: generate proxy, dispatcher, skeleton classes
- writing server: remote object classes are implemented and compiled with classes for dispatchers and skeletons. Server is also responsible for creating/initialising objects, and registering them with the binder.
- writing client: client programs implement invoking code and contain proxies for all remote classes. Binder used to lookup remote objects
Dynamic invocation
- as proxies are precompiled to a program, they don’t allow invocation of remote interface not known during compilation
- dynamic invocation: permits invocation of a generic interface, using a doOperation method for generic remote invocation
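Dynamic invocation can be illustrated with a proxy that is not generated per interface. In this sketch the method name is resolved at run time and a direct call stands in for the message exchange; `Servant`, `DynamicProxy` and the snake_case `do_operation` are assumed names.

```python
class Servant:
    def greet(self, name):
        return "hello " + name

def do_operation(remote_object, method_name, arguments):
    """Generic invocation: the method is named at run time."""
    method = getattr(remote_object, method_name, None)
    if method_name.startswith("_") or not callable(method):
        raise AttributeError("no remote method named " + method_name)
    return method(*arguments)

class DynamicProxy:
    """Forwards any method call through do_operation."""
    def __init__(self, remote_object):
        self._remote_object = remote_object

    def __getattr__(self, method_name):
        # Only reached for names not defined on the proxy itself.
        return lambda *arguments: do_operation(
            self._remote_object, method_name, arguments)

proxy = DynamicProxy(Servant())
greeting = proxy.greet("ada")
```

The trade-off is that mistakes such as a misspelt method name surface only at run time, whereas a precompiled proxy would reject them at compile time.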
Server and client program
- server contains
- classes for dispatchers, skeletons
- initialisation section for creating/initialising at least one servant
- code for registering servants with the binder
- client contains
- classes for all proxies of remote objects
Factory Methods
- servants can’t be created by remote invocation on constructors (interfaces cannot have constructors)
- factory method: method used to create servants
- factory object: object with factory methods
- factory methods are included in the remote object interface to allow clients to create remote objects on demand
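A factory object might be sketched as follows. The `Whiteboard` servant and the integer that stands in for a remote object reference are assumptions for illustration only.

```python
class Whiteboard:
    """Servant class whose instances clients want to create remotely."""
    def __init__(self, title):
        self.title = title
        self.shapes = []

    def add_shape(self, shape):
        self.shapes.append(shape)
        return len(self.shapes)

class WhiteboardFactory:
    """Factory object: its factory method is part of the remote interface."""
    def __init__(self):
        self.servants = {}

    def create_whiteboard(self, title):
        # The constructor runs on the server, on behalf of the client.
        object_id = len(self.servants) + 1
        self.servants[object_id] = Whiteboard(title)
        return object_id  # stands in for a remote object reference

factory = WhiteboardFactory()
ref = factory.create_whiteboard("design review")
```

The client never calls the constructor directly; it invokes `create_whiteboard` remotely and receives a reference to the new servant.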
Binder
- client programs need a way to get the remote object reference of remote objects in the server
- binder: service in distributed system that supports this
- maintains table mapping textual names to object references
- servers register remote objects (by name) with the binder
- clients look them up by name
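At its core a binder is a name table. The sketch below is a minimal in-process version; the `bind`, `rebind` and `lookup` method names are chosen for illustration, and the tuple used as a remote object reference is hypothetical.

```python
class Binder:
    """Maps textual names to remote object references."""

    def __init__(self):
        self._table = {}

    def bind(self, name, remote_ref):
        """Register a name; refuse to overwrite an existing binding."""
        if name in self._table:
            raise KeyError(name + " is already bound")
        self._table[name] = remote_ref

    def rebind(self, name, remote_ref):
        """Register a name, replacing any existing binding."""
        self._table[name] = remote_ref

    def lookup(self, name):
        """Return the remote object reference bound to the name."""
        return self._table[name]

binder = Binder()
binder.bind("ShapeList", ("10.0.0.1", 4000, 1))   # done by the server
found = binder.lookup("ShapeList")                # done by the client
```

In a real system the binder is itself a remote service at a well-known address, so clients can reach it before they hold any other remote object reference.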
Activation of Remote Objects
- some applications require that information survive for long periods, but it's not practical to keep it in running processes indefinitely
- to avoid wasting resources from running all servers that manage remote objects simultaneously, servers can be started whenever needed by clients
- activator: processes that start server processes to host remote objects
- registers passive objects available for activation
- starts named server processes and activates remote objects in them
- keeps track of locations of servers for remote objects that have already been activated
- active remote object: is one that is available for invocation in the process
- passive remote object: is not currently active, but can be made active. Contains
- implementation of the methods
- state in marshalled form
- activation: creating an active object from the corresponding passive object
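Activation and its reverse can be sketched with an in-process activator. The example assumes JSON as the marshalled form and keeps everything in one process; the `Activator` methods and the `Counter` class are hypothetical, and a real activator would start server processes instead.

```python
import json

class Counter:
    def __init__(self, value=0):
        self.value = value

    def increment(self):
        self.value += 1
        return self.value

class Activator:
    """Sketch: keeps passive objects and activates them on demand."""

    def __init__(self):
        self.passive = {}  # name -> (class, state in marshalled form)
        self.active = {}   # name -> live object

    def register(self, name, cls, state):
        self.passive[name] = (cls, json.dumps(state))

    def activate(self, name):
        """Create the active object from the passive one if needed."""
        if name not in self.active:
            cls, marshalled = self.passive[name]
            self.active[name] = cls(**json.loads(marshalled))
        return self.active[name]

    def passivate(self, name):
        """Marshal the current state and drop the live object."""
        obj = self.active.pop(name)
        self.passive[name] = (type(obj), json.dumps(vars(obj)))

activator = Activator()
activator.register("hits", Counter, {"value": 41})
counter = activator.activate("hits")
count = counter.increment()
```

A passive object here is the pair the notes describe: the implementation (the class) plus the state in marshalled form.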
Persistent Object Stores
- persistent object: object guaranteed to live between activations of processes
- persistent object stores: manage persistent objects, storing their state in marshalled form on disk
- e.g. CORBA persistent state service, Java Data Objects
Object Location
- possible remote object reference: IP address + port number to help guarantee uniqueness
- can also be used as an address, as long as the object remains in the same process for the rest of its life
- location service: helps clients locate remote objects based on remote object references
- uses database that maps remote object references to probable current locations
Java RMI
- Java RMI extends Java object model to provide support for distributed objects,
- allowing objects to invoke methods on remote objects using the same syntax as for local invocations
- objects making remote invocations are aware their target is remote, as they must handle RemoteExceptions
- implementer of a remote object is aware the object is remote, as its interface extends the Remote interface
Developing a Java RMI server:
- specify remote interface
- implement Servant class
- compile interface and servant classes
- generate skeleton and stub classes
- implement server
- compile server
Developing a Java RMI client:
- implement client program
- compile client program
Java RMI: Distributed Garbage Collection
- Java’s distributed GC mechanism:
- B.holders: server maintains a list of the names of clients holding remote object references to object B
- when client A receives a reference to remote object B, it makes an addRef(B) invocation to the server
- server adds A to B.holders
- when A’s GC notices no references exist for the proxy of remote object B, it makes a removeRef(B) invocation to the server
- server removes A from B.holders
- when B.holders is empty, server’s local GC releases the space
- race conditions need to be addressed
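The holder bookkeeping can be sketched as below. This is a simplified, single-process model rather than Java's actual implementation: `RemoteObjectServer` and its snake_case methods are hypothetical, and deleting the table entry stands in for making the object eligible for the server's local GC.

```python
class RemoteObjectServer:
    """Sketch of reference counting by holder set, one set per object."""

    def __init__(self):
        self.objects = {}  # name -> remote object
        self.holders = {}  # name -> set of client names

    def export(self, name, obj):
        self.objects[name] = obj
        self.holders[name] = set()

    def add_ref(self, name, client):
        """Called when a client receives a reference to the object."""
        self.holders[name].add(client)

    def remove_ref(self, name, client):
        """Called when the client's GC has collected its proxy."""
        self.holders[name].discard(client)
        if not self.holders[name]:
            # No remote holders remain: drop the server's entries so
            # the local garbage collector can reclaim the object.
            del self.objects[name]
            del self.holders[name]

server = RemoteObjectServer()
server.export("B", object())
server.add_ref("B", "A")
server.add_ref("B", "C")
server.remove_ref("B", "A")
still_alive = "B" in server.objects
server.remove_ref("B", "C")
```

The sketch ignores the race mentioned above: if one client's `remove_ref` empties the set while another client's `add_ref` is still in transit, the object could be released too early, so a real protocol has to account for references in flight.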