Inputs and Readers

小牛编辑
2023-12-01

Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.

Contents

Inputs and Readers

  • Placeholders
    • tf.placeholder(dtype, shape=None, name=None)
  • Readers
    • class tf.ReaderBase
    • class tf.TextLineReader
    • class tf.WholeFileReader
    • class tf.IdentityReader
    • class tf.TFRecordReader
    • class tf.FixedLengthRecordReader
  • Converting
    • tf.decode_csv(records, record_defaults, field_delim=None, name=None)
    • tf.decode_raw(bytes, out_type, little_endian=None, name=None)
    • Example protocol buffer
    • tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')
    • tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')
  • Queues
    • class tf.QueueBase
    • class tf.FIFOQueue
    • class tf.RandomShuffleQueue
  • Dealing with the filesystem
    • tf.matching_files(pattern, name=None)
    • tf.read_file(filename, name=None)
  • Input pipeline
    • Beginning of an input pipeline
    • tf.train.match_filenames_once(pattern, name=None)
    • tf.train.limit_epochs(tensor, num_epochs=None, name=None)
    • tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
    • tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
    • tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
    • Batching at the end of an input pipeline
    • tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)
    • tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)
    • tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)
    • tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)

Placeholders

TensorFlow provides a placeholder operation that must be fed with data on execution. For more info, see the section on Feeding data.


tf.placeholder(dtype, shape=None, name=None)

Inserts a placeholder for a tensor that will always be fed.

Important: This tensor will produce an error if evaluated. Its value must be fed using the feed_dict optional argument to Session.run(), Tensor.eval(), or Operation.run().

For example:

import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)

with tf.Session() as sess:
  # print(sess.run(y))  # ERROR: would fail because x was not fed.

  rand_array = np.random.rand(1024, 1024)
  print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.
Args:
  • dtype: The type of elements in the tensor to be fed.
  • shape: The shape of the tensor to be fed (optional). If the shape is not specified, you can feed a tensor of any shape.
  • name: A name for the operation (optional).
Returns:

A Tensor that may be used as a handle for feeding a value, but not evaluated directly.

Readers

TensorFlow provides a set of Reader classes for reading data formats. For more information on inputs and readers, see Reading data.


class tf.ReaderBase

Base class for the different Reader types, each of which produces one record per step.

Conceptually, Readers convert string 'work units' into records (key, value pairs). Typically the 'work units' are filenames and the records are extracted from the contents of those files. We want a single record produced per step, but a work unit can correspond to many records.

Therefore we introduce some decoupling using a queue. The queue holds the work units, and the Reader dequeues from it only when it is asked to produce a record (via Read()) and has already finished its last work unit.
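
The decoupling described above can be sketched in plain Python. This is only a model of the idea, not TensorFlow's implementation: the in-memory FILES dict, the SketchLineReader class, and the "filename:line" key format are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical in-memory "files" (filename -> contents), not part of TensorFlow.
FILES = {
    "a.txt": "alpha\nbeta\n",
    "b.txt": "gamma\n",
}


class SketchLineReader:
    """Plain-Python stand-in for a Reader: one (key, value) record per read()."""

    def __init__(self):
        self._pending = deque()  # records left in the current work unit
        self.num_records_produced = 0
        self.num_work_units_completed = 0

    def read(self, work_queue):
        # Dequeue a new work unit only when the current one is exhausted.
        while not self._pending:
            filename = work_queue.popleft()  # IndexError if no work is left
            for i, line in enumerate(FILES[filename].splitlines(), start=1):
                self._pending.append(("%s:%d" % (filename, i), line))
            if not self._pending:  # an empty file is finished immediately
                self.num_work_units_completed += 1
        record = self._pending.popleft()
        self.num_records_produced += 1
        if not self._pending:  # that was the last record of this work unit
            self.num_work_units_completed += 1
        return record


work = deque(["a.txt", "b.txt"])
reader = SketchLineReader()
for _ in range(3):
    print(reader.read(work))
```

Note that the second filename stays in the queue until both records of the first file have been returned, which is the point of the design: each call yields exactly one record, however many records a work unit contains.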


tf.ReaderBase.__init__(reader_ref, supports_serialize=False)

Creates a new ReaderBase.

Args:
  • reader_ref: The operation that implements the reader.
  • supports_serialize: True if the reader implementation can serialize its state.

tf.ReaderBase.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.ReaderBase.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.ReaderBase.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.ReaderBase.reader_ref

Op that implements the reader.


tf.ReaderBase.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.ReaderBase.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.ReaderBase.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.ReaderBase.supports_serialize

Whether the Reader implementation can serialize its state.


class tf.TextLineReader

A Reader that outputs the lines of a file delimited by newlines.

Newlines are stripped from the output. See ReaderBase for supported methods.


tf.TextLineReader.__init__(skip_header_lines=None, name=None)

Create a TextLineReader.

Args:
  • skip_header_lines: An optional int. Defaults to 0. Number of lines to skip from the beginning of every file.
  • name: A name for the operation (optional).
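
As a rough plain-Python illustration of what skip_header_lines means for a single file (the read_lines helper is hypothetical and stands in for the reader, it is not a TensorFlow function):

```python
def read_lines(contents, skip_header_lines=0):
    """Return the lines of `contents` without newlines, minus the header."""
    lines = contents.splitlines()  # drops the line terminators
    return lines[skip_header_lines:]


csv_text = "id,score\n1,0.5\n2,0.75\n"
print(read_lines(csv_text, skip_header_lines=1))
```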

tf.TextLineReader.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.TextLineReader.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.TextLineReader.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.TextLineReader.reader_ref

Op that implements the reader.


tf.TextLineReader.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.TextLineReader.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.TextLineReader.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.TextLineReader.supports_serialize

Whether the Reader implementation can serialize its state.


class tf.WholeFileReader

A Reader that outputs the entire contents of a file as a value.

To use, enqueue filenames in a Queue. The output of Read will be a filename (key) and the contents of that file (value).

See ReaderBase for supported methods.


tf.WholeFileReader.__init__(name=None)

Create a WholeFileReader.

Args:
  • name: A name for the operation (optional).

tf.WholeFileReader.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.WholeFileReader.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.WholeFileReader.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.WholeFileReader.reader_ref

Op that implements the reader.


tf.WholeFileReader.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.WholeFileReader.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.WholeFileReader.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.WholeFileReader.supports_serialize

Whether the Reader implementation can serialize its state.


class tf.IdentityReader

A Reader that outputs the queued work as both the key and value.

To use, enqueue strings in a Queue. Read will take the front work string and output (work, work).

See ReaderBase for supported methods.


tf.IdentityReader.__init__(name=None)

Create an IdentityReader.

Args:
  • name: A name for the operation (optional).

tf.IdentityReader.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.IdentityReader.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.IdentityReader.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.IdentityReader.reader_ref

Op that implements the reader.


tf.IdentityReader.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.IdentityReader.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.IdentityReader.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.IdentityReader.supports_serialize

Whether the Reader implementation can serialize its state.


class tf.TFRecordReader

A Reader that outputs the records from a TFRecords file.

See ReaderBase for supported methods.


tf.TFRecordReader.__init__(name=None)

Create a TFRecordReader.

Args:
  • name: A name for the operation (optional).

tf.TFRecordReader.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.TFRecordReader.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.TFRecordReader.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.TFRecordReader.reader_ref

Op that implements the reader.


tf.TFRecordReader.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.TFRecordReader.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.TFRecordReader.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.TFRecordReader.supports_serialize

Whether the Reader implementation can serialize its state.


class tf.FixedLengthRecordReader

A Reader that outputs fixed-length records from a file.

See ReaderBase for supported methods.


tf.FixedLengthRecordReader.__init__(record_bytes, header_bytes=None, footer_bytes=None, name=None)

Create a FixedLengthRecordReader.

Args:
  • record_bytes: An int.
  • header_bytes: An optional int. Defaults to 0.
  • footer_bytes: An optional int. Defaults to 0.
  • name: A name for the operation (optional).
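
The argument descriptions above are terse, so here is a plain-Python sketch of one plausible reading: header_bytes are skipped at the start of the file, footer_bytes at the end, and the remainder is cut into records of record_bytes each. The fixed_length_records helper is hypothetical, and raising ValueError on a ragged body is an assumption of this sketch rather than documented behaviour.

```python
def fixed_length_records(data, record_bytes, header_bytes=0, footer_bytes=0):
    """Split `data` (bytes) into records of exactly `record_bytes` bytes."""
    body = data[header_bytes:len(data) - footer_bytes]
    if len(body) % record_bytes != 0:
        # Assumption of this sketch: a partial trailing record is an error.
        raise ValueError("body length is not a multiple of record_bytes")
    return [body[i:i + record_bytes] for i in range(0, len(body), record_bytes)]


blob = b"HD" + b"aaaabbbbcccc" + b"F"
print(fixed_length_records(blob, record_bytes=4, header_bytes=2, footer_bytes=1))
```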

tf.FixedLengthRecordReader.num_records_produced(name=None)

Returns the number of records this reader has produced.

This is the same as the number of Read executions that have succeeded.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.FixedLengthRecordReader.num_work_units_completed(name=None)

Returns the number of work units this reader has finished processing.

Args:
  • name: A name for the operation (optional).
Returns:

An int64 Tensor.


tf.FixedLengthRecordReader.read(queue, name=None)

Returns the next record (key, value pair) produced by a reader.

Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

Args:
  • queue: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
  • name: A name for the operation (optional).
Returns:

A tuple of Tensors (key, value).

  • key: A string scalar Tensor.
  • value: A string scalar Tensor.

tf.FixedLengthRecordReader.reader_ref

Op that implements the reader.


tf.FixedLengthRecordReader.reset(name=None)

Restore a reader to its initial clean state.

Args:
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.FixedLengthRecordReader.restore_state(state, name=None)

Restore a reader to a previously saved state.

Not all Readers support being restored, so this can produce an Unimplemented error.

Args:
  • state: A string Tensor. Result of a SerializeState of a Reader with matching type.
  • name: A name for the operation (optional).
Returns:

The created Operation.


tf.FixedLengthRecordReader.serialize_state(name=None)

Produce a string tensor that encodes the state of a reader.

Not all Readers support being serialized, so this can produce an Unimplemented error.

Args:
  • name: A name for the operation (optional).
Returns:

A string Tensor.


tf.FixedLengthRecordReader.supports_serialize

Whether the Reader implementation can serialize its state.

Converting

TensorFlow provides several operations that you can use to convert various data formats into tensors.


tf.decode_csv(records, record_defaults, field_delim=None, name=None)

Convert CSV records to tensors. Each column maps to one tensor.

The CSV records are expected to follow RFC 4180 (https://tools.ietf.org/html/rfc4180). Leading and trailing spaces are allowed in int and float fields.

Args:
  • records: A Tensor of type string. Each string is a record/row in the csv and all records should have the same format.
  • record_defaults: A list of Tensor objects with types from: float32, int32, int64, string. One tensor per column of the input record, with either a scalar default value for that column or empty if the column is required.
  • field_delim: An optional string. Defaults to ",". The delimiter that separates fields in a record.
  • name: A name for the operation (optional).
Returns:

A list of Tensor objects. Has the same type as record_defaults. Each tensor will have the same shape as records.
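
The role of record_defaults can be illustrated with Python's standard csv module. This is a sketch for a single record under stated assumptions: decode_csv_record is a hypothetical helper, and each default is modelled as a (type, default) pair in which a default of None marks a required column (standing in for the empty typed tensor described above).

```python
import csv


def decode_csv_record(record, record_defaults, field_delim=","):
    """Parse one CSV row into one Python value per column."""
    fields = next(csv.reader([record], delimiter=field_delim))
    if len(fields) != len(record_defaults):
        raise ValueError("expected %d fields, got %d"
                         % (len(record_defaults), len(fields)))
    row = []
    for text, (col_type, default) in zip(fields, record_defaults):
        if text == "":
            if default is None:
                raise ValueError("required field is missing")
            row.append(default)  # empty field: fall back to the default
        elif col_type is str:
            row.append(text)
        else:
            row.append(col_type(text.strip()))  # spaces allowed around numbers
    return row


defaults = [(int, None), (float, 0.0), (str, "n/a")]
print(decode_csv_record(' 7 ,,"a,b"', defaults))
```

The first column is required, the second falls back to 0.0 because its field is empty, and the quoted third field keeps its embedded comma.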


tf.decode_raw(bytes, out_type, little_endian=None, name=None)

Reinterpret the bytes of a string as a vector of numbers.

Args:
  • bytes: A Tensor of type string. All the elements must have the same length.
  • out_type: A tf.DType from: tf.float32, tf.float64, tf.int32, tf.uint8, tf.int16, tf.int8, tf.int64.
  • little_endian: An optional bool. Defaults to True. Whether the input bytes are in little-endian order. Ignored for out_types that are stored in a single byte like uint8.
  • name: A name for the operation (optional).
Returns:

A Tensor of type out_type. A Tensor with one more dimension than the input bytes. The added dimension will have size equal to the length of the elements of bytes divided by the number of bytes to represent out_type.
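
The same reinterpretation can be shown for a single byte string with Python's struct module. decode_raw_sketch and its string type names are assumptions of this sketch, not part of the TensorFlow API; the point is how little_endian changes the result and how the output length follows from the element size.

```python
import struct

# struct format characters for the out_type values listed above.
_FORMATS = {"uint8": "B", "int8": "b", "int16": "h", "int32": "i",
            "int64": "q", "float32": "f", "float64": "d"}


def decode_raw_sketch(data, out_type, little_endian=True):
    """Reinterpret `data` (bytes) as a list of numbers of type `out_type`."""
    code = _FORMATS[out_type]
    size = struct.calcsize("<" + code)  # bytes per output element
    if len(data) % size != 0:
        raise ValueError("byte length must be a multiple of %d" % size)
    order = "<" if little_endian else ">"
    return list(struct.unpack(order + code * (len(data) // size), data))


raw = b"\x01\x00\x02\x00"
print(decode_raw_sketch(raw, "int16"))                       # [1, 2]
print(decode_raw_sketch(raw, "int16", little_endian=False))  # [256, 512]
print(decode_raw_sketch(raw, "uint8"))                       # [1, 0, 2, 0]
```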


Example protocol buffer

TensorFlow's recommended format for training examples is serialized Example protocol buffers, described here. They contain Features, described here.


tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')

Parses Example protos.

Parses a number of serialized Example protos (https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) given in serialized.

names may contain descriptive names for the corresponding serialized protos. These may be useful for debugging purposes, but they have no effect on the output. If not None, names must be the same length as serialized.

This op parses serialized examples into a dictionary mapping keys to Tensor and SparseTensor objects, depending on whether the keys appear in dense_keys or sparse_keys, respectively.

The key dense_keys[j] is mapped to a Tensor of type dense_types[j] and of shape (serialized.size(),) + dense_shapes[j].

dense_defaults provides defaults for values referenced using dense_keys. If a key is not present in this dictionary, the corresponding dense Feature is required in all elements of serialized.

dense_shapes[j] provides the shape of each Feature entry referenced by dense_keys[j]. The Feature corresponding to dense_keys[j] must always have np.prod(dense_shapes[j]) entries. The returned Tensor for dense_keys[j] has shape [N] + dense_shapes[j], where N is the number of Examples in serialized.

The key sparse_keys[j] is mapped to a SparseTensor of type sparse_types[j]. The SparseTensor represents a ragged matrix. Its indices are [batch, index] where batch is the batch entry the value is from, and index is the value's index in the list of values associated with that feature and example.

Examples:

For example, if one expects a tf.float32 sparse feature ft and three serialized Examples are provided:

serialized = [
  features: {
    feature: { key: "ft" value: { float_list: { value: [1.0, 2.0] } } }
  },
  features: { },
  features: {
    feature: { key: "ft" value: { float_list: { value: [3.0] } } }
  }
]

then the output will look like:

{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
                    values=[1.0, 2.0, 3.0],
                    shape=(3, 2)) }
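
The index layout in the output above can be reproduced with a few lines of plain Python. ragged_to_sparse is a hypothetical helper that works on ordinary lists instead of serialized protos; it only illustrates how the [batch, index] pairs and the shape are derived.

```python
def ragged_to_sparse(rows):
    """Return (indices, values, shape) for a list of per-example value lists."""
    indices, values = [], []
    for batch, row in enumerate(rows):
        for index, value in enumerate(row):
            indices.append([batch, index])  # [example number, position in list]
            values.append(value)
    max_len = max((len(row) for row in rows), default=0)
    return indices, values, (len(rows), max_len)


print(ragged_to_sparse([[1.0, 2.0], [], [3.0]]))
```

The second example contributes no entries at all, which is why batch index 1 never appears in the indices.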

Given two Example input protos in serialized:

[
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "knit", "big" ] } } }
    feature: { key: "gps" value: { float_list: { value: [] } } }
  },
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "emmy" ] } } }
    feature: { key: "dank" value: { int64_list: { value: [ 42 ] } } }
    feature: { key: "gps" value: { } }
  }
]

And arguments

  names: ["input0", "input1"],
  sparse_keys: ["kw", "dank", "gps"]
  sparse_types: [DT_STRING, DT_INT64, DT_FLOAT]

Then the output is a dictionary:

{
  "kw": SparseTensor(
      indices=[[0, 0], [0, 1], [1, 0]],
      values=["knit", "big", "emmy"],
      shape=[2, 2]),
  "dank": SparseTensor(
      indices=[[1, 0]],
      values=[42],
      shape=[2, 1]),
  "gps": SparseTensor(
      indices=[],
      values=[],
      shape=[2, 0]),
}

For dense results in two serialized Examples:

[
  features: {
    feature: { key: "age" value: { int64_list: { value: [ 0 ] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
   },
   features: {
    feature: { key: "age" value: { int64_list: { value: [] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
  }
]

We can use arguments:

names: ["input0", "input1"],
dense_keys: np.array(["age", "gender"]),
dense_types: [tf.int64, tf.string],
dense_defaults: {
  "age": -1  # "age" defaults to -1 if missing
             # "gender" has no specified default so it's required
}
dense_shapes: [(1,), (1,)],  # age, gender

And the expected output is:

{
  "age": [[0], [-1]],
  "gender": [["f"], ["f"]],
}
Args:
  • serialized: A list of strings, a batch of binary serialized Example protos.
  • names: A list of strings, the names of the serialized protos.
  • sparse_keys: A list of string keys in the examples' features. The results for these keys will be returned as SparseTensor objects.
  • sparse_types: A list of DTypes of the same length as sparse_keys. Only tf.float32 (FloatList), tf.int64 (Int64List), and tf.string (BytesList) are supported.
  • dense_keys: A list of string keys in the examples' features. The results for these keys will be returned as Tensors.
  • dense_types: A list of DTypes of the same length as dense_keys. Only tf.float32 (FloatList), tf.int64 (Int64List), and tf.string (BytesList) are supported.
  • dense_defaults: A dict mapping string keys to Tensors. The keys of the dict must match the dense_keys of the feature.
  • dense_shapes: A list of tuples with the same length as dense_keys. The shape of the data for each dense feature referenced by dense_keys.
  • name: A name for this operation (optional).
Returns:

A dict mapping keys to Tensors and SparseTensors.

Raises:
  • ValueError: If sparse and dense key sets intersect, or input lengths do not match up.
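
The dense example above (defaults for missing values, and the np.prod(dense_shapes[j]) entry count) can be modelled in plain Python. dense_column is a hypothetical helper operating on dicts rather than serialized protos. Two simplifications are assumptions of this sketch: rows are kept flat instead of being reshaped, and a scalar default is repeated to fill the expected number of entries.

```python
import math


def dense_column(examples, key, shape, default=None):
    """Build the dense rows for `key` from a list of {key: values} dicts."""
    expected = math.prod(shape)  # same count as np.prod(shape)
    column = []
    for features in examples:
        values = features.get(key, [])
        if not values:
            if default is None:
                raise ValueError("feature %r is required" % key)
            values = [default] * expected
        if len(values) != expected:
            raise ValueError("feature %r must have %d entries" % (key, expected))
        column.append(list(values))
    return column


examples = [
    {"age": [0], "gender": ["f"]},
    {"age": [], "gender": ["f"]},
]
print(dense_column(examples, "age", (1,), default=-1))  # [[0], [-1]]
print(dense_column(examples, "gender", (1,)))           # [['f'], ['f']]
```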

tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')

Parses a single Example proto.

Similar to parse_example, except:

For dense tensors, the returned Tensor is identical to the output of parse_example, except that there is no batch dimension: the output shape is the same as the shape given in dense_shapes.

For SparseTensors, the first (batch) column of the indices matrix is removed (the indices matrix is a column vector), the values vector is unchanged, and the first (batch_size) entry of the shape vector is removed (it is now a single element vector).

See also parse_example.
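
The difference for SparseTensors can be pictured as dropping the batch column from a batch-of-one result. drop_batch_dimension is a hypothetical helper on plain (indices, values, shape) triples, meant only to illustrate the shapes involved.

```python
def drop_batch_dimension(indices, values, shape):
    """Convert a batch-of-one sparse triple into the single-example form."""
    if shape[0] != 1:
        raise ValueError("expected a batch of exactly one example")
    # Remove the first (batch) column of indices and the first shape entry.
    return [row[1:] for row in indices], values, shape[1:]


batched = ([[0, 0], [0, 1]], ["knit", "big"], (1, 2))
print(drop_batch_dimension(*batched))
```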

Args:
  • serialized: A scalar string, a single serialized Example. See parse_example documentation for more details.
  • names: (Optional) A scalar string, the associated name. See parse_example documentation for more details.
  • sparse_keys: See parse_example documentation for more details.
  • sparse_types: See parse_example documentation for more details.
  • dense_keys: See parse_example documentation for more details.
  • dense_types: See parse_example documentation for more details.
  • dense_defaults: See parse_example documentation for more details.
  • dense_shapes: See parse_example documentation for more details.
  • name: A name for this operation (optional).
Returns:

A dictionary mapping keys to Tensors and SparseTensors.

Raises:
  • ValueError: If "serialized" or "names" have known shapes and are not scalars.

Queues

TensorFlow provides several queue implementations: structures within the TensorFlow computation graph that stage pipelines of tensors together. The following sections describe the basic Queue interface and some implementations. For an example, see Threading and Queues.


class tf.QueueBase

Base class for queue implementations.

A queue is a TensorFlow data structure that stores tensors across multiple steps, and exposes operations that enqueue and dequeue tensors.

Each queue element is a tuple of one or more tensors, where each tuple component has a static dtype and may have a static shape. The queue implementations support versions of enqueue and dequeue that handle single elements, and versions that enqueue and dequeue a batch of elements at once.

See tf.FIFOQueue and tf.RandomShuffleQueue for concrete implementations of this class, and instructions on how to create them.


tf.QueueBase.enqueue(vals, name=None)

Enqueues one element to this queue.

If the queue is full when this operation executes, it will block until the element has been enqueued.

Args:
  • vals: The tuple of Tensor objects to be enqueued.
  • name: A name for the operation (optional).
Returns:

The operation that enqueues a new tuple of tensors to the queue.


tf.QueueBase.enqueue_many(vals, name=None)

Enqueues zero or more elements to this queue.

This operation slices each component tensor along the 0th dimension to make multiple queue elements. All of the tensors in vals must have the same size in the 0th dimension.

If the queue is full when this operation executes, it will block until all of the elements have been enqueued.

Args:
  • vals: The tensor or tuple of tensors from which the queue elements are taken.
  • name: A name for the operation (optional).
Returns:

The operation that enqueues a batch of tuples of tensors to the queue.


tf.QueueBase.dequeue(name=None)

Dequeues one element from this queue.

If the queue is empty when this operation executes, it will block until there is an element to dequeue.

Args:
  • name: A name for the operation (optional).
Returns:

The tuple of tensors that was dequeued.


tf.QueueBase.dequeue_many(n, name=None)

Dequeues and concatenates n elements from this queue.

This operation concatenates queue-element component tensors along the 0th dimension to make a single component tensor. All of the components in the dequeued tuple will have size n in the 0th dimension.

If the queue contains fewer than n elements when this operation executes, it will block until n elements have been dequeued.

Args:
  • n: A scalar Tensor containing the number of elements to dequeue.
  • name: A name for the operation (optional).
Returns:

The tuple of concatenated tensors that was dequeued.
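
The slicing performed by enqueue_many and the concatenation performed by dequeue_many are inverse operations along dimension 0. A plain-Python sketch with nested lists in place of tensors (slice_elements and concat_elements are hypothetical names, and no actual queue is involved):

```python
def slice_elements(components):
    """enqueue_many-style: split each component along dimension 0."""
    sizes = {len(c) for c in components}
    if len(sizes) != 1:
        raise ValueError("all components need the same size in dimension 0")
    return list(zip(*components))  # one tuple per queue element


def concat_elements(elements):
    """dequeue_many-style: stack n elements back into per-component lists."""
    return tuple(list(c) for c in zip(*elements))


images = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # size 3 in dimension 0
labels = [0, 1, 0]                             # size 3 in dimension 0
elements = slice_elements((images, labels))
print(elements[0])                    # ([0.1, 0.2], 0)
print(concat_elements(elements[:2]))  # ([[0.1, 0.2], [0.3, 0.4]], [0, 1])
```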


tf.QueueBase.size(name=None)

Compute the number of elements in this queue.

Args:
  • name: A name for the operation (optional).
Returns:

A scalar tensor containing the number of elements in this queue.


tf.QueueBase.close(cancel_pending_enqueues=False, name=None)

Closes this queue.

This operation signals that no more elements will be enqueued in the given queue. Subsequent enqueue and enqueue_many operations will fail. Subsequent dequeue and dequeue_many operations will continue to succeed if sufficient elements remain in the queue. Subsequent dequeue and dequeue_many operations that would block will fail immediately.

If cancel_pending_enqueues is True, all pending enqueue requests will also be cancelled.

Args:
  • cancel_pending_enqueues: (Optional.) A boolean, defaulting to False (described above).
  • name: A name for the operation (optional).
Returns:

The operation that closes the queue.
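
The rules for a closed queue can be summarised in a small non-blocking model. SketchQueue and ClosedError are hypothetical names; real queue operations block instead of raising RuntimeError, and the sketch does not model cancel_pending_enqueues.

```python
from collections import deque


class ClosedError(Exception):
    """Raised where the real operations would fail on a closed queue."""


class SketchQueue:
    def __init__(self):
        self._items = deque()
        self._closed = False

    def enqueue(self, item):
        if self._closed:
            raise ClosedError("enqueue after close")
        self._items.append(item)

    def dequeue(self):
        if self._items:
            return self._items.popleft()  # still succeeds after close
        if self._closed:
            raise ClosedError("queue is closed and empty")
        raise RuntimeError("would block (blocking is not modelled here)")

    def close(self):
        self._closed = True


q = SketchQueue()
q.enqueue("last element")
q.close()
print(q.dequeue())  # elements enqueued before close() can still be dequeued
```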

Other Methods


tf.QueueBase.__init__(dtypes, shapes, queue_ref)

Constructs a queue object from a queue reference.

Args:
  • dtypes: A list of types. The length of dtypes must equal the number of tensors in each element.
  • shapes: Constraints on the shapes of tensors in an element: a list of shape tuples, or None. This list is the same length as dtypes. If the shape of any tensor in the element is constrained, all must be; shapes can be None if the shapes should not be constrained.
  • queue_ref: The queue reference, i.e. the output of the queue op.

tf.QueueBase.dtypes

The list of dtypes for each component of a queue element.


tf.QueueBase.name

The name of the underlying queue.


tf.QueueBase.queue_ref

The underlying queue reference.


class tf.FIFOQueue

A queue implementation that dequeues elements in first-in first-out order.

See tf.QueueBase for a description of the methods on this class.


tf.FIFOQueue.__init__(capacity, dtypes, shapes=None, shared_name=None, name='fifo_queue')

Creates a queue that dequeues elements in a first-in first-out order.

A FIFOQueue has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery.

A FIFOQueue holds a list of up to capacity elements. Each element is a fixed-length tuple of tensors whose dtypes are described by dtypes, and whose shapes are optionally described by the shapes argument.

If the shapes argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of dequeue_many is disallowed.

Args:
  • capacity: An integer. The upper bound on the number of elements that may be stored in this queue.
  • dtypes: A list of DType objects. The length of dtypes must equal the number of tensors in each queue element.
  • shapes: (Optional.) A list of fully-defined TensorShape objects with the same length as dtypes, or None.
  • shared_name: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions.
  • name: Optional name for the queue operation.
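
The bounded capacity and first-in first-out ordering described above can be illustrated without TensorFlow. The following is a minimal sketch that uses Python's standard queue.Queue as a stand-in; it models only ordering and the capacity bound, not tensors, dtypes, shapes, or sessions. The helpers try_enqueue and dequeue_many are hypothetical names for this illustration, not the TensorFlow ops.

```python
import queue

# Stand-in for a FIFOQueue with capacity=3; each element is a fixed-length tuple.
q = queue.Queue(maxsize=3)

for element in [(1, "a"), (2, "b"), (3, "c")]:
    q.put(element)  # would block if the queue were already full


def try_enqueue(q, element):
    """Non-blocking enqueue: returns False when the queue is at capacity."""
    try:
        q.put_nowait(element)
        return True
    except queue.Full:
        return False


def dequeue_many(q, n):
    """Remove n elements in arrival order (blocks until n are available)."""
    return [q.get() for _ in range(n)]


overflow_accepted = try_enqueue(q, (4, "d"))  # the queue is full, so this is rejected
first_two = dequeue_many(q, 2)                # the two oldest elements, in order
```

In the real queue an enqueue on a full queue blocks rather than failing; the non-blocking helper is used here only so the example terminates.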

class tf.RandomShuffleQueue

A queue implementation that dequeues elements in a random order.

See tf.QueueBase for a description of the methods on this class.


tf.RandomShuffleQueue.__init__(capacity, min_after_dequeue, dtypes, shapes=None, seed=None, shared_name=None, name='random_shuffle_queue')

Creates a queue that dequeues elements in a random order.

A RandomShuffleQueue has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery.

A RandomShuffleQueue holds a list of up to capacity elements. Each element is a fixed-length tuple of tensors whose dtypes are described by dtypes, and whose shapes are optionally described by the shapes argument.

If the shapes argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of dequeue_many is disallowed.

The min_after_dequeue argument allows the caller to specify a minimum number of elements that will remain in the queue after a dequeue or dequeue_many operation completes, to ensure a minimum level of mixing of elements. This invariant is maintained by blocking those operations until sufficient elements have been enqueued. The min_after_dequeue argument is ignored after the queue has been closed.

Args:
  • capacity: An integer. The upper bound on the number of elements that may be stored in this queue.
  • min_after_dequeue: An integer (described above).
  • dtypes: A list of DType objects. The length of dtypes must equal the number of tensors in each queue element.
  • shapes: (Optional.) A list of fully-defined TensorShape objects with the same length as dtypes, or None.
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
  • shared_name: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions.
  • name: Optional name for the queue operation.
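
The min_after_dequeue rule is easier to follow with a small model. The sketch below is plain Python, not TensorFlow: ShuffleQueueSketch is a hypothetical class that raises an error where the real queue would block, and it assumes a dequeue is permitted only if at least min_after_dequeue elements would remain afterwards, unless the queue has been closed.

```python
import random


class ShuffleQueueSketch:
    """Plain-Python model of the min_after_dequeue rule (not a TensorFlow API)."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self.closed = False
        self._items = []
        self._rng = random.Random(seed)

    def enqueue(self, item):
        if self.closed:
            raise RuntimeError("queue is closed")
        if len(self._items) >= self.capacity:
            raise RuntimeError("queue is full (the real op would block)")
        self._items.append(item)

    def can_dequeue(self, n=1):
        # After close(), min_after_dequeue is ignored and the queue may drain fully.
        floor = 0 if self.closed else self.min_after_dequeue
        return len(self._items) - n >= floor

    def dequeue(self):
        if not self.can_dequeue():
            raise RuntimeError("too few elements (the real op would block)")
        index = self._rng.randrange(len(self._items))
        return self._items.pop(index)

    def close(self):
        self.closed = True


q = ShuffleQueueSketch(capacity=10, min_after_dequeue=3, seed=0)
for i in range(5):
    q.enqueue(i)
drawn = [q.dequeue(), q.dequeue()]      # leaves exactly 3 elements
blocked = not q.can_dequeue()           # a third dequeue would have to wait
q.close()
rest = [q.dequeue() for _ in range(3)]  # draining is allowed once closed
```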

Dealing with the filesystem


tf.matching_files(pattern, name=None)

Returns the set of files matching a pattern.

Note that this routine only supports wildcard characters in the basename portion of the pattern, not in the directory portion.

Args:
  • pattern: A Tensor of type string. A (scalar) shell wildcard pattern.
  • name: A name for the operation (optional).
Returns:

A Tensor of type string. A vector of matching filenames.
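
The basename-only restriction noted above can be mimicked in plain Python. This is a rough sketch, not the op's implementation: matching_files_sketch is a hypothetical helper that treats the directory part of the pattern literally and applies shell-style wildcards only to the file name, using the standard fnmatch module. The temporary directory and file names are made up for the example.

```python
import fnmatch
import os
import tempfile


def matching_files_sketch(pattern):
    """Match wildcards in the basename only; the directory part is taken literally."""
    directory, base_pattern = os.path.split(pattern)
    directory = directory or "."
    names = sorted(os.listdir(directory))
    return [os.path.join(directory, name)
            for name in names
            if fnmatch.fnmatchcase(name, base_pattern)]


# Build a throwaway directory so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
for name in ["train-00.csv", "train-01.csv", "notes.txt"]:
    with open(os.path.join(tmp_dir, name), "w") as handle:
        handle.write("placeholder\n")

matches = matching_files_sketch(os.path.join(tmp_dir, "train-*.csv"))
```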


tf.read_file(filename, name=None)

Reads and outputs the entire contents of the input filename.

Args:
  • filename: A Tensor of type string.
  • name: A name for the operation (optional).
Returns:

A Tensor of type string.

Input pipeline

TensorFlow functions for setting up an input-prefetching pipeline. Please see the reading data how-to for context.

Beginning of an input pipeline

The "producer" functions add a queue to the graph and a corresponding QueueRunner for running the subgraph that fills that queue.


tf.train.match_filenames_once(pattern, name=None)

Save the list of files matching pattern, so it is only computed once.

Args:
  • pattern: A file pattern (glob).
  • name: A name for the operations (optional).
Returns:

A variable that is initialized to the list of files matching pattern.


tf.train.limit_epochs(tensor, num_epochs=None, name=None)

Returns tensor num_epochs times and then raises an OutOfRange error.

Args:
  • tensor: Any Tensor.
  • num_epochs: An integer (optional). If specified, limits the number of times the output tensor may be evaluated.
  • name: A name for the operations (optional).
Returns:

tensor or OutOfRange.
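
As a rough illustration of this counting behavior, the plain-Python sketch below returns a value a fixed number of times and then raises. LimitEpochsSketch and OutOfRangeSketch are hypothetical names standing in for the op and for TensorFlow's OutOfRange error; nothing here touches a graph or session.

```python
class OutOfRangeSketch(Exception):
    """Stand-in for TensorFlow's OutOfRange error (hypothetical name)."""


class LimitEpochsSketch:
    """Return a value at most num_epochs times, then raise."""

    def __init__(self, value, num_epochs=None):
        self.value = value
        self.num_epochs = num_epochs
        self._evaluations = 0

    def evaluate(self):
        # With num_epochs=None there is no limit on evaluations.
        if self.num_epochs is not None and self._evaluations >= self.num_epochs:
            raise OutOfRangeSketch("epoch limit reached")
        self._evaluations += 1
        return self.value


limited = LimitEpochsSketch("filenames", num_epochs=2)
results = [limited.evaluate(), limited.evaluate()]
try:
    limited.evaluate()
    raised = False
except OutOfRangeSketch:
    raised = True
```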


tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)

Produces the integers from 0 to limit-1 in a queue.

Args:
  • limit: An int32 scalar tensor.
  • num_epochs: An integer (optional). If specified, range_input_producer produces each integer num_epochs times before generating an OutOfRange error. If not specified, range_input_producer can cycle through the integers an unlimited number of times.
  • shuffle: Boolean. If true, the integers are randomly shuffled within each epoch.
  • seed: An integer (optional). Seed used if shuffle == True.
  • capacity: An integer. Sets the queue capacity.
  • name: A name for the operations (optional).
Returns:

A Queue with the output integers. A QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.
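
The per-epoch behavior described by num_epochs and shuffle can be sketched with a Python generator. This is an illustrative model only: range_producer_sketch is a hypothetical function, there is no queue or QueueRunner, and the generator simply ending plays the role of the OutOfRange error.

```python
import random


def range_producer_sketch(limit, num_epochs=None, shuffle=True, seed=None):
    """Yield 0..limit-1 once per epoch, reshuffled each epoch when shuffle is true."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(range(limit))
        if shuffle:
            rng.shuffle(order)
        for value in order:
            yield value
        epoch += 1
    # Falling off the end here corresponds to the OutOfRange error.


produced = list(range_producer_sketch(limit=4, num_epochs=2, seed=7))
first_epoch, second_epoch = produced[:4], produced[4:]
```

Each epoch is a full permutation of the range, so every integer appears exactly num_epochs times overall.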


tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)

Produces a slice of each Tensor in tensor_list.

Implemented using a Queue -- a QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.

Args:
  • tensor_list: A list of Tensors. Every Tensor in tensor_list must have the same size in the first dimension.
  • num_epochs: An integer (optional). If specified, slice_input_producer produces each slice num_epochs times before generating an OutOfRange error. If not specified, slice_input_producer can cycle through the slices an unlimited number of times.
  • shuffle: Boolean. If true, the slices are randomly shuffled within each epoch.
  • seed: An integer (optional). Seed used if shuffle == True.
  • capacity: An integer. Sets the queue capacity.
  • name: A name for the operations (optional).
Returns:

A list of tensors, one for each element of tensor_list. If a tensor in tensor_list has shape [N, a, b, ..., z], then the corresponding output tensor will have shape [a, b, ..., z].
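
To make the slicing concrete, here is a small sketch that uses nested Python lists in place of tensors. slices_sketch is a hypothetical helper; it walks the first dimension in order for a single epoch and omits shuffling and queueing.

```python
def slices_sketch(tensor_list):
    """Yield one slice per index along the first dimension of every input."""
    sizes = {len(t) for t in tensor_list}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same size in the first dimension")
    for i in range(sizes.pop()):
        yield [t[i] for t in tensor_list]


images = [[[0, 1], [2, 3]], [[4, 5], [6, 7]], [[8, 9], [10, 11]]]  # shape [3, 2, 2]
labels = [0, 1, 0]                                                  # shape [3]

slices = list(slices_sketch([images, labels]))
# Each slice pairs one [2, 2] image with its scalar label.
```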


tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)

Output strings (e.g. filenames) to a queue for an input pipeline.

Args:
  • string_tensor: A 1-D string tensor with the strings to produce.
  • num_epochs: An integer (optional). If specified, string_input_producer produces each string from string_tensor num_epochs times before generating an OutOfRange error. If not specified, string_input_producer can cycle through the strings in string_tensor an unlimited number of times.
  • shuffle: Boolean. If true, the strings are randomly shuffled within each epoch.
  • seed: An integer (optional). Seed used if shuffle == True.
  • capacity: An integer. Sets the queue capacity.
  • name: A name for the operations (optional).
Returns:

A queue with the output strings. A QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.

Batching at the end of an input pipeline

These functions add a queue to the graph to assemble a batch of examples, with possible shuffling. They also add a QueueRunner for running the subgraph that fills that queue.

Use batch or batch_join for batching examples that have already been well shuffled. Use shuffle_batch or shuffle_batch_join for examples that would benefit from additional shuffling.

Use batch or shuffle_batch if you want a single thread producing examples to batch, or if you have a single subgraph producing examples but you want to run it in N threads (where you increase N until it can keep the queue full). Use batch_join or shuffle_batch_join if you have N different subgraphs producing examples to batch and you want them run by N threads.


tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)

Creates batches of tensors in tensor_list.

This function is implemented using a queue. A QueueRunner for the queue is added to the current Graph's QUEUE_RUNNER collection.

If enqueue_many is False, tensor_list is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].

If enqueue_many is True, tensor_list is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].

The capacity argument controls how long the prefetching is allowed to grow the queues.

Args:
  • tensor_list: The list of tensors to enqueue.
  • batch_size: The new batch size pulled from the queue.
  • num_threads: The number of threads enqueuing tensor_list.
  • capacity: An integer. The maximum number of elements in the queue.
  • enqueue_many: If False (the default), each tensor in tensor_list is a single example; if True, each tensor is a batch of examples.
  • shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list.
  • name: (Optional) A name for the operations.
Returns:

A list of tensors with the same number and types as tensor_list.
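
The two shape rules above reduce to a short calculation. The sketch below is a hypothetical helper, batched_shape, that works on plain lists of dimension sizes rather than tensors, and assumes the first dimension is the example index when enqueue_many is True.

```python
def batched_shape(input_shape, batch_size, enqueue_many=False):
    """Return the output shape for one input tensor under the two cases above."""
    shape = list(input_shape)
    if enqueue_many:
        if not shape:
            raise ValueError("enqueue_many=True needs at least one dimension")
        shape = shape[1:]  # drop the per-example dimension
    return [batch_size] + shape


single_example = batched_shape([28, 28, 3], batch_size=32)
many_examples = batched_shape([100, 28, 28, 3], batch_size=32, enqueue_many=True)
```

Both calls give the same output shape; the difference is whether the input already carried an example dimension.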


tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)

Runs a list of tensors to fill a queue to create batches of examples.

Enqueues a different list of tensors in different threads. Implemented using a queue -- a QueueRunner for the queue is added to the current Graph's QUEUE_RUNNER collection.

len(tensor_list_list) threads will be started, with thread i enqueuing the tensors from tensor_list_list[i]. tensor_list_list[i1][j] must match tensor_list_list[i2][j] in type and shape, except in the first dimension if enqueue_many is true.

If enqueue_many is False, each tensor_list_list[i] is assumed to represent a single example. An input tensor x will be output as a tensor with shape [batch_size] + x.shape.

If enqueue_many is True, tensor_list_list[i] is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list_list[i] should have the same size in the first dimension. The slices of any input tensor x are treated as examples, and the output tensors will have shape [batch_size] + x.shape[1:].

The capacity argument controls how long the prefetching is allowed to grow the queues.

Args:
  • tensor_list_list: A list of tuples of tensors to enqueue.
  • batch_size: An integer. The new batch size pulled from the queue.
  • capacity: An integer. The maximum number of elements in the queue.
  • enqueue_many: If False (the default), each tensor_list_list[i] is a single example; if True, each tensor_list_list[i] is a batch of examples.
  • shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list_list[i].
  • name: (Optional) A name for the operations.
Returns:

A list of tensors with the same number and types as tensor_list_list[i].
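
The threading model, one thread per entry of tensor_list_list with all threads feeding one queue, can be sketched with the standard threading and queue modules. This is an analogy rather than TensorFlow code: batch_join_sketch is a hypothetical function, each source is a finite Python list, and an incomplete final batch is simply discarded.

```python
import queue
import threading


def batch_join_sketch(sources, batch_size, capacity=32):
    """Run one thread per source, all feeding one bounded queue, and collect full batches."""
    shared = queue.Queue(maxsize=capacity)

    def worker(source):
        for example in source:
            shared.put(example)  # blocks while the queue is at capacity

    threads = [threading.Thread(target=worker, args=(s,)) for s in sources]
    for t in threads:
        t.start()

    total = sum(len(s) for s in sources)
    batches = []
    for _ in range(total // batch_size):
        batches.append([shared.get() for _ in range(batch_size)])

    # Drain the leftover examples so no worker stays blocked on a full queue.
    for _ in range(total % batch_size):
        shared.get()

    for t in threads:
        t.join()
    return batches


sources = [[("a", i) for i in range(4)], [("b", i) for i in range(4)]]
batches = batch_join_sketch(sources, batch_size=4, capacity=2)
```

The interleaving of the two sources within a batch depends on thread scheduling, which is why batch_join suits inputs that do not need a fixed order.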


tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)

Creates batches by randomly shuffling tensors.

This function adds the following to the current Graph:

  • A shuffling queue into which tensors from tensor_list are enqueued.
  • A dequeue_many operation to create batches from the queue.
  • A QueueRunner, added to the QUEUE_RUNNER collection, to enqueue the tensors from tensor_list.

If enqueue_many is False, tensor_list is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].

If enqueue_many is True, tensor_list is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].

The capacity argument controls how long the prefetching is allowed to grow the queues.

For example:

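# Creates batches of 32 images and 32 labels.
# single_image and single_label are assumed to be tensors for one example,
# produced earlier in the input pipeline.
image_batch, label_batch = tf.train.shuffle_batch(
    [single_image, single_label],
    batch_size=32,
    num_threads=4,
    capacity=50000,
    min_after_dequeue=10000)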
Args:
  • tensor_list: The list of tensors to enqueue.
  • batch_size: The new batch size pulled from the queue.
  • capacity: An integer. The maximum number of elements in the queue.
  • min_after_dequeue: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
  • num_threads: The number of threads enqueuing tensor_list.
  • seed: Seed for the random shuffling within the queue.
  • enqueue_many: If False (the default), each tensor in tensor_list is a single example; if True, each tensor is a batch of examples.
  • shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list.
  • name: (Optional) A name for the operations.
Returns:

A list of tensors with the same number and types as tensor_list.
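
How capacity and min_after_dequeue interact can be seen in a single-threaded model of the shuffling buffer. The sketch below is plain Python under stated assumptions: shuffle_batches_sketch is a hypothetical generator, the input is a finite iterable, and it requires capacity >= min_after_dequeue + batch_size so that a full buffer always permits a batch (a simplification of this sketch, not a documented TensorFlow constraint).

```python
import random


def shuffle_batches_sketch(examples, batch_size, capacity, min_after_dequeue, seed=None):
    """Yield shuffled batches from a bounded buffer, mimicking the queue described above."""
    if capacity < min_after_dequeue + batch_size:
        raise ValueError("capacity must be at least min_after_dequeue + batch_size")
    rng = random.Random(seed)
    source = iter(examples)
    buffer, exhausted = [], False
    while True:
        # Fill the buffer up to capacity, as the enqueuing threads would.
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True  # input is closed: min_after_dequeue no longer applies
        floor = 0 if exhausted else min_after_dequeue
        if len(buffer) - batch_size < floor:
            return  # not enough elements left for a full batch
        yield [buffer.pop(rng.randrange(len(buffer))) for _ in range(batch_size)]


batches = list(shuffle_batches_sketch(range(20), batch_size=4, capacity=12,
                                      min_after_dequeue=6, seed=1))
```

A larger min_after_dequeue gives better mixing, at the cost of more memory and a longer wait before the first batch is available.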


tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)

Creates batches by randomly shuffling tensors.

This version enqueues a different list of tensors in different threads. It adds the following to the current Graph:

  • A shuffling queue into which tensors from tensor_list_list are enqueued.
  • A dequeue_many operation to create batches from the queue.
  • A QueueRunner, added to the QUEUE_RUNNER collection, to enqueue the tensors from tensor_list_list.

len(tensor_list_list) threads will be started, with thread i enqueuing the tensors from tensor_list_list[i]. tensor_list_list[i1][j] must match tensor_list_list[i2][j] in type and shape, except in the first dimension if enqueue_many is true.

If enqueue_many is False, each tensor_list_list[i] is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].

If enqueue_many is True, tensor_list_list[i] is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list_list[i] should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].

The capacity argument controls how long the prefetching is allowed to grow the queues.

Args:
  • tensor_list_list: A list of tuples of tensors to enqueue.
  • batch_size: An integer. The new batch size pulled from the queue.
  • capacity: An integer. The maximum number of elements in the queue.
  • min_after_dequeue: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
  • seed: Seed for the random shuffling within the queue.
  • enqueue_many: If False (the default), each tensor_list_list[i] is a single example; if True, each tensor_list_list[i] is a batch of examples.
  • shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list_list[i].
  • name: (Optional) A name for the operations.
Returns:

A list of tensors with the same number and types as tensor_list_list[i].