Introduction to RPC (Remote Procedure Call)
Remote Procedure Call (RPC) is a foundational
technology of distributed computing systems such as Hadoop. To understand the
Hadoop framework, a solid understanding of RPC is a must. This article is
written as an introduction to RPC.
In this blog, I am going to talk about:
- The background of RPC
- The components and data flow of an RPC call
- Popular RPC frameworks
Background:
Here is a definition of RPC from Wikipedia:
Remote Procedure Call (RPC) is an inter-process communication that
allows a computer program to cause a subroutine or procedure to execute in
another address space (commonly on another computer on a shared network)
without the programmer explicitly coding the details for this remote
interaction. That is, the programmer writes essentially the same code whether
the subroutine is local to the executing program, or remote.
Compared with lower-level network
programming frameworks such as socket programming, RPC hides most of the
difficulties, such as connection management and traffic control, so that the
program can focus on the application logic with little effort spent on the
network itself.
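To make that transparency concrete, here is a minimal sketch using Python's standard-library xmlrpc module (chosen only because it needs no extra setup; it is not the protocol used by Hadoop or by the frameworks discussed below). The add function and port number are made up for illustration.

```python
# server.py: expose an ordinary function over RPC.
from xmlrpc.server import SimpleXMLRPCServer

def add(x, y):
    return x + y

server = SimpleXMLRPCServer(("localhost", 8000))
server.register_function(add, "add")
server.serve_forever()
```

```python
# client.py: the remote call reads exactly like a local call; the proxy
# (the client stub) handles encoding, the network round trip, and decoding.
from xmlrpc.client import ServerProxy

proxy = ServerProxy("http://localhost:8000")
print(proxy.add(2, 3))  # prints 5
```

Notice that the client never touches a socket: swapping the remote add for a local function would not change the calling code at all.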
RPC was first created in the 1980s. The first popular implementation of RPC on Unix was Sun's RPC (now
called ONC RPC), which was used as the basis for the Network File System (NFS). Today RPC is widely
used in almost all distributed systems.
Components and Data Flow:
RPC client: the client program of the RPC;
it calls the server program over the network.
Client stub: encodes (marshals) the client's request and hands it to the
runtime library, then waits for the result. When the result arrives at the
client, the stub decodes it and passes it back to the calling program.
RPC runtime library: builds the RPC packets and
sends/receives them to/from the network. It implements the network
details of RPC for both sides.
Server skeleton: the stub's counterpart on the
server side; it receives and decodes the request and passes it to the server
program, then encodes the result and hands it back to the runtime library.
RPC server: provides the functions that
clients call.
Typically an RPC framework consists of these five components; the sketch below shows, in miniature, what the stub, runtime library, and skeleton each do.
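This is a deliberately hand-rolled sketch of the stub, runtime, and skeleton roles, using JSON over a TCP socket. Everything here (call_remote, serve, the port) is hypothetical; a real RPC framework generates this plumbing for you and adds error handling, timeouts, and versioning.

```python
import json
import socket

def call_remote(host, port, method, *args):
    """Client stub + runtime: encode the request, ship it, decode the reply."""
    request = json.dumps({"method": method, "args": args}).encode()
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)           # runtime: put the bytes on the wire
        sock.shutdown(socket.SHUT_WR)   # mark the end of the request
        reply = sock.makefile("rb").read()
    return json.loads(reply)["result"]  # stub: decode, return to the caller

def serve(port, functions):
    """Server skeleton + runtime: receive, decode, dispatch, encode, reply."""
    with socket.create_server(("localhost", port)) as srv:
        while True:
            conn, _ = srv.accept()
            with conn:
                request = json.loads(conn.makefile("rb").read())
                result = functions[request["method"]](*request["args"])
                conn.sendall(json.dumps({"result": result}).encode())
```

Running serve(9000, {"add": lambda x, y: x + y}) in one process and call_remote("localhost", 9000, "add", 2, 3) in another returns 5, with the caller never seeing any of the socket code.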
Most popular RPC frameworks:
Thrift:
Thrift was developed by Facebook and later contributed
to Apache. It combines a software stack with a code-generation engine to build
services that work efficiently and seamlessly across
multiple languages. Apache Thrift uses a
binary communication protocol.
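Here is a minimal Thrift client sketch in Python, following the shape of the official Thrift tutorial. It assumes a hypothetical calculator.thrift interface file compiled with thrift --gen py; the generated module name below is an assumption, since it depends on the namespace declared in that file.

```python
# Assumed IDL (calculator.thrift):
#   service Calculator {
#     i32 add(1: i32 num1, 2: i32 num2)
#   }
from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from calculator import Calculator  # generated code; module name is hypothetical

transport = TTransport.TBufferedTransport(TSocket.TSocket("localhost", 9090))
protocol = TBinaryProtocol.TBinaryProtocol(transport)  # Thrift's binary protocol
client = Calculator.Client(protocol)  # the client stub generated by Thrift

transport.open()
print(client.add(2, 3))  # looks like a local call; Thrift does the rest
transport.close()
```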
Protocol Buffers: Google's language-neutral,
platform-neutral, extensible mechanism for serializing structured data; think XML, but smaller, faster,
and simpler. You
define how you want your data to be structured once, then you can use specially
generated source code to easily write and read your structured data to and from
a variety of data streams, in a variety of languages such as Java, C++, or
Python.
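As a sketch of that workflow, suppose a hypothetical person.proto compiled with protoc --python_out=., which by convention produces a person_pb2 module; the message and its fields are made up for illustration.

```python
# Assumed definition (person.proto):
#   syntax = "proto3";
#   message Person {
#     string name = 1;
#     int32 id = 2;
#   }
import person_pb2  # generated by protoc; name follows the *_pb2 convention

person = person_pb2.Person(name="Ada", id=1)
data = person.SerializeToString()   # compact binary wire format

decoded = person_pb2.Person()
decoded.ParseFromString(data)       # round trip back to a typed object
print(decoded.name, decoded.id)     # Ada 1
```

Note that Protocol Buffers by itself is a serialization format; an RPC framework pairs it with a transport and generated stubs.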
Avro: a data serialization system. It uses JSON-based
schemas and sends data through RPC calls; the schema itself is transmitted
during the data exchange.
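Below is a small serialization round trip using the avro package from PyPI, with the JSON schema written inline; the record fields are made up, and Avro's RPC layer (which performs the schema handshake mentioned above) is omitted to keep the sketch short. Note that parse is spelled Parse in some older avro releases.

```python
import json
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

# The schema is plain JSON, as described above.
schema = avro.schema.parse(json.dumps({
    "namespace": "example.avro",
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": ["int", "null"]},
    ],
}))

# Write a record; the schema travels with the data in the file container.
writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
writer.append({"name": "Alyssa", "favorite_number": 256})
writer.close()

# Read it back; no generated code is needed, just the schema.
reader = DataFileReader(open("users.avro", "rb"), DatumReader())
for user in reader:
    print(user)
reader.close()
```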