ERDOS Package Reference

erdos.connect_source(op_type, config, *args, **kwargs)

Registers a Source operator to the dataflow graph, and returns the OperatorStream that the operator will write the data on.

Parameters
  • op_type (Type[Source[TypeVar(T)]]) – The Source operator that needs to be added to the graph.

  • config (OperatorConfig) – Configuration details required by the operator.

  • *args – Arguments passed to the operator during initialization.

  • **kwargs – Keyword arguments passed to the operator during initialization.

Return type

OperatorStream[TypeVar(T)]

Returns

An OperatorStream corresponding to the WriteStream made available to Source.run().

erdos.connect_sink(op_type, config, read_stream, *args, **kwargs)

Registers a Sink operator to the dataflow graph.

Parameters
  • op_type (Type[Sink[TypeVar(T)]]) – The Sink operator that needs to be added to the graph.

  • config (OperatorConfig) – Configuration details required by the operator.

  • read_stream (Stream[TypeVar(T)]) – The Stream instance from where the operator reads its data.

  • *args – Arguments passed to the operator during initialization.

  • **kwargs – Keyword arguments passed to the operator during initialization.

Return type

None

erdos.connect_one_in_one_out(op_type, config, read_stream, *args, **kwargs)

Registers a OneInOneOut operator to the dataflow graph that receives input from the given read_stream, and returns the OperatorStream that the operator will write the data on.

Parameters
  • op_type (Type[OneInOneOut[TypeVar(T), TypeVar(U)]]) – The OneInOneOut operator that needs to be added to the graph.

  • config (OperatorConfig) – Configuration details required by the operator.

  • read_stream (Stream[TypeVar(T)]) – The Stream instance from where the operator reads its data.

  • *args – Arguments passed to the operator during initialization.

  • **kwargs – Keyword arguments passed to the operator during initialization.

Return type

OperatorStream[TypeVar(U)]

Returns

An OperatorStream corresponding to the WriteStream made available to OneInOneOut.run(), or to the operator’s callbacks via the OneInOneOutContext.

erdos.connect_two_in_one_out(op_type, config, left_read_stream, right_read_stream, *args, **kwargs)

Registers a TwoInOneOut operator to the dataflow graph that receives input from the given left_read_stream and right_read_stream, and returns the OperatorStream that the operator sends messages on.

Parameters
  • op_type (Type[TwoInOneOut[TypeVar(T), TypeVar(U), TypeVar(V)]]) – The TwoInOneOut operator to add to the graph.

  • config (OperatorConfig) – Configuration details required by the operator.

  • left_read_stream (Stream[TypeVar(T)]) – The first Stream instance from where the operator reads its data.

  • right_read_stream (Stream[TypeVar(U)]) – The second Stream instance from where the operator reads its data.

  • *args – Arguments passed to the operator during initialization.

  • **kwargs – Keyword arguments passed to the operator during initialization.

Return type

OperatorStream[TypeVar(V)]

Returns

An OperatorStream corresponding to the WriteStream made available to TwoInOneOut.run(), or to the operator’s callbacks via the TwoInOneOutContext.

erdos.connect_one_in_two_out(op_type, config, read_stream, *args, **kwargs)

Registers a OneInTwoOut operator to the dataflow graph that receives input from the given read_stream, and returns the pair of OperatorStream instances that the operator will write data on.

Parameters
  • op_type (Type[OneInTwoOut[TypeVar(T), TypeVar(U), TypeVar(V)]]) – The OneInTwoOut operator that needs to be added to the graph.

  • config (OperatorConfig) – Configuration details required by the operator.

  • read_stream (Stream[TypeVar(T)]) – The Stream instance from where the operator reads its data.

  • *args – Arguments passed to the operator during initialization.

  • **kwargs – Keyword arguments passed to the operator during initialization.

Return type

Tuple[OperatorStream[TypeVar(U)], OperatorStream[TypeVar(V)]]

Returns

A pair of OperatorStream instances corresponding to the WriteStream instances made available to OneInOneOut.run(), or to the operator’s callbacks via the OneInTwoOutContext.

erdos.reset()

Create a new dataflow graph.

Note

A call to this function renders the previous dataflow graph unsafe to use.

Return type

None

erdos.run(graph_filename=None, start_port=9000)

Instantiates and runs the dataflow graph.

ERDOS will spawn 1 process for each python operator, and connect them via TCP.

Parameters
  • graph_filename (Optional[str]) – The filename to which to write the dataflow graph as a DOT file.

  • start_port (int) – The port on which to start. The start port is the lowest port ERDOS will use to establish TCP connections between operators.

Return type

None

erdos.run_async(graph_filename=None, start_port=9000)

Instantiates and runs the dataflow graph asynchronously.

ERDOS will spawn 1 process for each python operator, and connect them via TCP.

Parameters
  • graph_filename (Optional[str]) – The filename to which to write the dataflow graph as a DOT file.

  • start_port (int) – The port on which to start. The start port is the lowest port ERDOS will use to establish TCP connections between operators.

Return type

NodeHandle

Returns

A NodeHandle that allows the driver to interface with the dataflow graph.

class erdos.NodeHandle(py_node_handle, processes)

A handle to the dataflow graph returned by the run_async() function.

The handle exposes functions to shutdown() the dataflow, or wait() for its completion.

Note

This structure should not be initialized by the users.

shutdown()

Shuts down the dataflow.

Return type

None

wait()

Waits for the completion of all the operators in the dataflow

Return type

None