ERDOS Package Reference
- erdos.connect_source(op_type, config, *args, **kwargs)
Registers a Source operator to the dataflow graph, and returns the OperatorStream that the operator will write its data on.
- Parameters
op_type (Type[Source]) – The Source operator that needs to be added to the graph.
config (OperatorConfig) – Configuration details required by the operator.
*args – Arguments passed to the operator during initialization.
**kwargs – Keyword arguments passed to the operator during initialization.
- Return type
OperatorStream
- Returns
An OperatorStream corresponding to the WriteStream made available to Source.run().
- erdos.connect_sink(op_type, config, read_stream, *args, **kwargs)
Registers a Sink operator to the dataflow graph.
- Parameters
op_type (Type[Sink]) – The Sink operator that needs to be added to the graph.
config (OperatorConfig) – Configuration details required by the operator.
read_stream (Stream) – The Stream instance from which the operator reads its data.
*args – Arguments passed to the operator during initialization.
**kwargs – Keyword arguments passed to the operator during initialization.
- erdos.connect_one_in_one_out(op_type, config, read_stream, *args, **kwargs)
Registers a OneInOneOut operator to the dataflow graph that receives input from the given read_stream, and returns the OperatorStream that the operator will write its data on.
- Parameters
op_type (Type[OneInOneOut]) – The OneInOneOut operator that needs to be added to the graph.
config (OperatorConfig) – Configuration details required by the operator.
read_stream (Stream) – The Stream instance from which the operator reads its data.
*args – Arguments passed to the operator during initialization.
**kwargs – Keyword arguments passed to the operator during initialization.
- Return type
OperatorStream
- Returns
An OperatorStream corresponding to the WriteStream made available to OneInOneOut.run(), or to the operator's callbacks via the OneInOneOutContext.
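The three calls documented so far compose into a linear pipeline: the stream returned by one connect_* call is passed as the read_stream of the next. The sketch below shows only that wiring, using the signatures listed above. So that it runs without ERDOS installed, it takes the module as a parameter and is exercised with a hypothetical RecordingApi stand-in. The names build_linear_pipeline and RecordingApi, the string operator names, and the None configs are illustrative assumptions, not part of ERDOS.

```python
class RecordingApi:
    """Hypothetical stand-in for the `erdos` module that only records calls."""

    def __init__(self):
        self.calls = []

    def _record(self, kind, op_type, inputs):
        # The returned string plays the role of an OperatorStream.
        self.calls.append((kind, op_type, inputs))
        return f"{op_type}.out"

    def connect_source(self, op_type, config, *args, **kwargs):
        return self._record("source", op_type, ())

    def connect_one_in_one_out(self, op_type, config, read_stream, *args, **kwargs):
        return self._record("one_in_one_out", op_type, (read_stream,))

    def connect_sink(self, op_type, config, read_stream, *args, **kwargs):
        # The reference documents no return value for connect_sink.
        self._record("sink", op_type, (read_stream,))


def build_linear_pipeline(api, source, transform, sink):
    """Wire source -> transform -> sink.

    `api` would be the imported `erdos` module; every other argument is an
    (op_type, config) pair supplied by the caller.
    """
    source_stream = api.connect_source(*source)
    transformed_stream = api.connect_one_in_one_out(*transform, source_stream)
    api.connect_sink(*sink, transformed_stream)
    return source_stream, transformed_stream


api = RecordingApi()
streams = build_linear_pipeline(
    api, ("SourceOp", None), ("MapOp", None), ("SinkOp", None)
)
for call in api.calls:
    print(call)
```

With the real module, the same function would presumably be called as build_linear_pipeline(erdos, ...) with operator classes and OperatorConfig instances in place of the placeholders.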
- erdos.connect_two_in_one_out(op_type, config, left_read_stream, right_read_stream, *args, **kwargs)
Registers a TwoInOneOut operator to the dataflow graph that receives input from the given left_read_stream and right_read_stream, and returns the OperatorStream that the operator sends messages on.
- Parameters
op_type (Type[TwoInOneOut]) – The TwoInOneOut operator to add to the graph.
config (OperatorConfig) – Configuration details required by the operator.
left_read_stream (Stream) – The first Stream instance from which the operator reads its data.
right_read_stream (Stream) – The second Stream instance from which the operator reads its data.
*args – Arguments passed to the operator during initialization.
**kwargs – Keyword arguments passed to the operator during initialization.
- Return type
OperatorStream
- Returns
An OperatorStream corresponding to the WriteStream made available to TwoInOneOut.run(), or to the operator's callbacks via the TwoInOneOutContext.
- erdos.connect_one_in_two_out(op_type, config, read_stream, *args, **kwargs)
Registers a OneInTwoOut operator to the dataflow graph that receives input from the given read_stream, and returns the pair of OperatorStream instances that the operator will write its data on.
- Parameters
op_type (Type[OneInTwoOut]) – The OneInTwoOut operator that needs to be added to the graph.
config (OperatorConfig) – Configuration details required by the operator.
read_stream (Stream) – The Stream instance from which the operator reads its data.
*args – Arguments passed to the operator during initialization.
**kwargs – Keyword arguments passed to the operator during initialization.
- Return type
Tuple[OperatorStream, OperatorStream]
- Returns
A pair of OperatorStream instances corresponding to the WriteStream instances made available to OneInTwoOut.run(), or to the operator's callbacks via the OneInTwoOutContext.
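A fan-out followed by a fan-in is one way the two-stream calls might be combined: the pair returned by connect_one_in_two_out is unpacked and passed to connect_two_in_one_out as left_read_stream and right_read_stream. The sketch below relies only on the signatures listed above. It takes the module as a parameter and runs against a hypothetical RecordingApi stand-in, so split_then_join, RecordingApi, the ".left"/".right" stream labels, and the placeholder operators and configs are all assumptions made for illustration.

```python
class RecordingApi:
    """Hypothetical stand-in for the `erdos` module that records graph edges."""

    def __init__(self):
        self.edges = []  # (input stream, consuming operator) pairs

    def connect_one_in_two_out(self, op_type, config, read_stream, *args, **kwargs):
        self.edges.append((read_stream, op_type))
        # Two strings play the role of the returned pair of OperatorStreams.
        return f"{op_type}.left", f"{op_type}.right"

    def connect_two_in_one_out(
        self, op_type, config, left_read_stream, right_read_stream, *args, **kwargs
    ):
        self.edges.append((left_read_stream, op_type))
        self.edges.append((right_read_stream, op_type))
        return f"{op_type}.out"


def split_then_join(api, split, join, read_stream):
    """Fan one stream out into two, then join the two back into one.

    `api` would be the imported `erdos` module; `split` and `join` are
    (op_type, config) pairs supplied by the caller.
    """
    left_stream, right_stream = api.connect_one_in_two_out(*split, read_stream)
    return api.connect_two_in_one_out(*join, left_stream, right_stream)


api = RecordingApi()
joined = split_then_join(api, ("SplitOp", None), ("JoinOp", None), "input")
print(joined)
print(api.edges)
```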
- erdos.reset()
Creates a new dataflow graph.
Note
A call to this function renders the previous dataflow graph unsafe to use.
- erdos.run(graph_filename=None, start_port=9000)
Instantiates and runs the dataflow graph.
ERDOS will spawn one process for each Python operator, and connect them via TCP.
- Parameters
graph_filename (Optional[str]) – The filename to which to write the dataflow graph as a DOT file.
start_port (Optional[int]) – The port on which to start. The start port is the lowest port ERDOS will use to establish TCP connections between operators.
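Because start_port is the lowest port ERDOS will bind, a driver may want to confirm that the port is available before calling erdos.run. The helper below is a minimal sketch using only the standard socket module; port_is_free and first_free_port are hypothetical names, not ERDOS functions. The reference does not state how many ports above start_port are used, so this only probes candidate starting ports one at a time, and another process could still claim the port between the check and the call to erdos.run.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start_port=9000, attempts=100, host="127.0.0.1"):
    """Return the first bindable port in [start_port, start_port + attempts)."""
    for port in range(start_port, start_port + attempts):
        if port_is_free(port, host):
            return port
    last = start_port + attempts - 1
    raise RuntimeError(f"no free port between {start_port} and {last}")


# The result is a candidate value for the start_port argument.
start_port = first_free_port()
print(start_port)
```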
- erdos.run_async(graph_filename=None, start_port=9000)
Instantiates and runs the dataflow graph asynchronously.
ERDOS will spawn one process for each Python operator, and connect them via TCP.
- Parameters
graph_filename (Optional[str]) – The filename to which to write the dataflow graph as a DOT file.
start_port (Optional[int]) – The port on which to start. The start port is the lowest port ERDOS will use to establish TCP connections between operators.
- Return type
NodeHandle
- Returns
A NodeHandle that allows the driver to interface with the dataflow graph.
- class erdos.NodeHandle(py_node_handle, processes)
A handle to the dataflow graph returned by the run_async() function.
The handle exposes functions to shutdown() the dataflow, or wait() for its completion.
Note
This structure should not be initialized by users.
- shutdown()
Shuts down the dataflow.
- wait()
Waits for the completion of all the operators in the dataflow.
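Since run_async() returns control to the driver, the driver is responsible for calling shutdown() on the handle, including when its own code raises. One possible pattern is a context manager that guarantees the call. The sketch below is an assumption about usage rather than something ERDOS provides: running_dataflow is a hypothetical helper, and FakeHandle is a stand-in for NodeHandle so the example runs without ERDOS installed.

```python
from contextlib import contextmanager


@contextmanager
def running_dataflow(start):
    """Start a dataflow and shut it down when the `with` block exits.

    `start` is any zero-argument callable that returns an object with a
    shutdown() method, for example `lambda: erdos.run_async()`.
    """
    handle = start()
    try:
        yield handle
    finally:
        # Runs on normal exit and when the block raises.
        handle.shutdown()


class FakeHandle:
    """Hypothetical stand-in for NodeHandle, used only to exercise the sketch."""

    def __init__(self):
        self.events = []

    def shutdown(self):
        self.events.append("shutdown")

    def wait(self):
        self.events.append("wait")


fake = FakeHandle()
with running_dataflow(lambda: fake) as handle:
    handle.events.append("driver work")
print(fake.events)
```

A driver that instead wants to block until every operator finishes would call wait() on the handle rather than leaving the block early.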