Tuesday, 24 December 2013

Introduction to RPC (Remote Procedure Call)

This post is also published on another blog.
Remote Procedure Call (RPC) is a foundational technology of distributed computing systems such as Hadoop. To understand the Hadoop framework, a solid understanding of RPC is a must. This article is an introduction to RPC.
In this post, I am going to talk about:
•  The background of RPC
•  The components and data flow of an RPC call
•  The most popular RPC frameworks

Background:

Here is Wikipedia's definition of RPC: Remote Procedure Call (RPC) is an inter-process communication that allows a computer program to cause a subroutine or procedure to execute in another address space (commonly on another computer on a shared network) without the programmer explicitly coding the details for this remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote.
Compared with lower-level network programming such as socket programming, RPC hides most of the difficult parts, such as network connection management and traffic control, so that the programmer can focus on the application logic and spend little effort on the network itself.
The first practical RPC implementations appeared in the 1980s. The first popular implementation of RPC on Unix was Sun's RPC (now called ONC RPC), which was used as the basis for the Network File System (NFS). Today RPC is used in most distributed systems.
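To make the "same code whether local or remote" idea concrete, here is a minimal sketch. It assumes Python and its standard-library xmlrpc modules, chosen only because they need no installation; XML-RPC is not one of the frameworks discussed in this post, and the function name add is made up for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a, b):
    """An ordinary function; the server exposes it for remote calls."""
    return a + b


def start_server():
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(add, "add")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_server()
host, port = server.server_address

# The proxy hides the connection handling: calling proxy.add(...) sends
# a request over HTTP and returns the decoded result.
proxy = ServerProxy(f"http://{host}:{port}")
local_result = add(2, 3)          # plain local call
remote_result = proxy.add(2, 3)   # looks the same, but goes over the network
print(local_result, remote_result)

server.shutdown()
server.server_close()
```

Notice that the calling code never opens a socket or formats a message itself. That is the part an RPC framework takes off your hands.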

Components and Data-Flow:

A typical RPC framework has the following components:
RPC client: the client program. It calls the server program over the network.
Client stub: encodes the client's request and sends it to the runtime library, then waits for the result. When the result arrives, the stub decodes it and passes it to the client.
RPC runtime library: builds the RPC packets and sends and receives them over the network. It implements the network details of RPC.
Server skeleton: the server-side counterpart of the client stub. It receives and decodes the request and passes it to the server, then encodes the result and passes it to the runtime library.
RPC server: provides the server functions that clients call.
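The data flow between these components can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: JSON stands in for the wire encoding, the "runtime library" is a plain function call instead of a real network connection, and all names (client_stub, runtime_send, server_skeleton, multiply) are made up for this example.

```python
import json


# --- Server side -----------------------------------------------------
def multiply(a, b):
    """The actual server procedure."""
    return a * b


PROCEDURES = {"multiply": multiply}


def server_skeleton(request_bytes):
    """Decode the request, call the server procedure, encode the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    procedure = PROCEDURES[request["method"]]
    result = procedure(*request["params"])
    return json.dumps({"result": result}).encode("utf-8")


# --- Runtime library (stand-in) ---------------------------------------
def runtime_send(request_bytes):
    """Hand the bytes to the skeleton and return the reply bytes.

    A real runtime would open a connection, send the packet, and wait
    for the reply; here both sides live in one process.
    """
    return server_skeleton(request_bytes)


# --- Client side -----------------------------------------------------
def client_stub(method, *params):
    """Encode the call, pass it to the runtime, decode the reply."""
    request = {"method": method, "params": params}
    request_bytes = json.dumps(request).encode("utf-8")
    reply_bytes = runtime_send(request_bytes)
    return json.loads(reply_bytes.decode("utf-8"))["result"]


# The RPC client just makes what looks like a local call.
product = client_stub("multiply", 6, 7)
print(product)  # 42
```

Only bytes cross the boundary between client_stub and server_skeleton, so swapping runtime_send for a real socket would not change either side.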

Most popular RPC frameworks:

Thrift: developed by Facebook and later contributed to Apache. It combines a software stack with a code generation engine to build services that work efficiently and seamlessly across multiple languages. Thrift messages are typically sent using a binary protocol.
Protocol Buffers: Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data. Think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then use generated source code to easily write and read your structured data to and from a variety of data streams, using a variety of languages such as Java, C++, or Python.
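To get a feel for why a schema-based binary encoding can be smaller than XML, here is a rough sketch using Python's standard struct module. This is not Protocol Buffers' actual wire format or API; the record layout and the names encode_person and decode_person are assumptions made up for this illustration.

```python
import struct

# Hypothetical "schema", agreed on once by writer and reader:
# a 32-bit unsigned id, a 64-bit float score, and a name stored as
# a 1-byte length followed by UTF-8 bytes.
HEADER = struct.Struct("<IdB")


def encode_person(person_id, score, name):
    """Write one record in the fixed layout described above."""
    name_bytes = name.encode("utf-8")
    return HEADER.pack(person_id, score, len(name_bytes)) + name_bytes


def decode_person(data):
    """Read one record back using the same layout."""
    person_id, score, name_len = HEADER.unpack_from(data)
    start = HEADER.size
    name = data[start:start + name_len].decode("utf-8")
    return person_id, score, name


binary = encode_person(42, 9.5, "Ada")
xml = b"<person><id>42</id><score>9.5</score><name>Ada</name></person>"

decoded = decode_person(binary)
print(decoded)
print(len(binary), "bytes in binary vs", len(xml), "bytes as XML")
```

Because both sides already know the layout, the binary record does not need to carry field names. That is where most of the size saving comes from. Real frameworks generate the encode and decode code from the schema, so you do not write it by hand as done here.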

Avro: a data serialization system. It uses JSON-based schemas and provides RPC calls for sending data. The schema is sent along during the data exchange.
