Inputs and Readers
Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.
Contents
Inputs and Readers
- Placeholders
tf.placeholder(dtype, shape=None, name=None)
- Readers
class tf.ReaderBase
class tf.TextLineReader
class tf.WholeFileReader
class tf.IdentityReader
class tf.TFRecordReader
class tf.FixedLengthRecordReader
- Converting
tf.decode_csv(records, record_defaults, field_delim=None, name=None)
tf.decode_raw(bytes, out_type, little_endian=None, name=None)
- Example protocol buffer
tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')
tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')
- Queues
class tf.QueueBase
class tf.FIFOQueue
class tf.RandomShuffleQueue
- Dealing with the filesystem
tf.matching_files(pattern, name=None)
tf.read_file(filename, name=None)
- Input pipeline
- Beginning of an input pipeline
tf.train.match_filenames_once(pattern, name=None)
tf.train.limit_epochs(tensor, num_epochs=None, name=None)
tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
- Batching at the end of an input pipeline
tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)
tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)
tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)
tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)
Placeholders
TensorFlow provides a placeholder operation that must be fed with data on execution. For more info, see the section on Feeding data.
tf.placeholder(dtype, shape=None, name=None)
Inserts a placeholder for a tensor that will always be fed.
Important: This tensor will produce an error if evaluated. Its value must be fed using the feed_dict optional argument to Session.run(), Tensor.eval(), or Operation.run().
For example:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)

with tf.Session() as sess:
    print(sess.run(y))  # ERROR: will fail because x was not fed.

    rand_array = np.random.rand(1024, 1024)
    print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.
Args:
dtype
: The type of elements in the tensor to be fed.
shape
: The shape of the tensor to be fed (optional). If the shape is not specified, you can feed a tensor of any shape.
name
: A name for the operation (optional).
Returns:
A Tensor that may be used as a handle for feeding a value, but not evaluated directly.
Readers
TensorFlow provides a set of Reader classes for reading data formats. For more information on inputs and readers, see Reading data.
class tf.ReaderBase
Base class for the different Reader types, each of which produces one record every step.
Conceptually, Readers convert string 'work units' into records (key, value pairs). Typically the 'work units' are filenames and the records are extracted from the contents of those files. We want a single record produced per step, but a work unit can correspond to many records.
Therefore we introduce some decoupling using a queue. The queue contains the work units, and the Reader dequeues from it when it is asked to produce a record (via Read()) but has already finished the last work unit.
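The decoupling described above can be modelled without TensorFlow. The following is a minimal plain-Python sketch, not the library's implementation: LineReaderSketch, its in-memory files dictionary, and the "filename:line" key format are assumptions made for illustration.

```python
from collections import deque


class LineReaderSketch:
    """Toy model of a Reader: turns work units (file names) into records."""

    def __init__(self, files):
        self._files = files        # hypothetical stand-in for a filesystem: name -> text
        self._current = None       # work unit currently being processed
        self._pending = deque()    # lines of the current work unit not yet returned
        self._line_no = 0
        self.num_records_produced = 0
        self.num_work_units_completed = 0

    def read(self, work_queue):
        """Return the next (key, value) record, dequeuing new work only when needed."""
        while not self._pending:
            if self._current is not None:
                # The previous work unit is exhausted.
                self.num_work_units_completed += 1
                self._current = None
            # Raises IndexError when no work is left; a real queue would block instead.
            self._current = work_queue.popleft()
            self._pending = deque(self._files[self._current].splitlines())
            self._line_no = 0
        self._line_no += 1
        self.num_records_produced += 1
        key = "%s:%d" % (self._current, self._line_no)
        return key, self._pending.popleft()


files = {"a.txt": "x\ny\n", "b.txt": "z\n"}
work_queue = deque(["a.txt", "b.txt"])
reader = LineReaderSketch(files)
records = [reader.read(work_queue) for _ in range(3)]
print(records)
```

Three reads consume two work units, yet the caller only ever asks for one record at a time; the reader decides when to pull the next file name from the queue.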
tf.ReaderBase.__init__(reader_ref, supports_serialize=False)
Creates a new ReaderBase.
Args:
reader_ref
: The operation that implements the reader.
supports_serialize
: True if the reader implementation can serialize its state.
tf.ReaderBase.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.ReaderBase.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.ReaderBase.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.ReaderBase.reader_ref
Op that implements the reader.
tf.ReaderBase.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.ReaderBase.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.ReaderBase.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.ReaderBase.supports_serialize
Whether the Reader implementation can serialize its state.
class tf.TextLineReader
A Reader that outputs the lines of a file delimited by newlines.
Newlines are stripped from the output. See ReaderBase for supported methods.
tf.TextLineReader.__init__(skip_header_lines=None, name=None)
Create a TextLineReader.
Args:
skip_header_lines
: An optional int. Defaults to 0. Number of lines to skip from the beginning of every file.
name
: A name for the operation (optional).
tf.TextLineReader.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.TextLineReader.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.TextLineReader.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.TextLineReader.reader_ref
Op that implements the reader.
tf.TextLineReader.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.TextLineReader.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.TextLineReader.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.TextLineReader.supports_serialize
Whether the Reader implementation can serialize its state.
class tf.WholeFileReader
A Reader that outputs the entire contents of a file as a value.
To use, enqueue filenames in a Queue. The output of Read will be a filename (key) and the contents of that file (value).
See ReaderBase for supported methods.
tf.WholeFileReader.__init__(name=None)
Create a WholeFileReader.
Args:
name
: A name for the operation (optional).
tf.WholeFileReader.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.WholeFileReader.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.WholeFileReader.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.WholeFileReader.reader_ref
Op that implements the reader.
tf.WholeFileReader.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.WholeFileReader.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.WholeFileReader.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.WholeFileReader.supports_serialize
Whether the Reader implementation can serialize its state.
class tf.IdentityReader
A Reader that outputs the queued work as both the key and value.
To use, enqueue strings in a Queue. Read will take the front work string and output (work, work).
See ReaderBase for supported methods.
tf.IdentityReader.__init__(name=None)
Create an IdentityReader.
Args:
name
: A name for the operation (optional).
tf.IdentityReader.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.IdentityReader.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.IdentityReader.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.IdentityReader.reader_ref
Op that implements the reader.
tf.IdentityReader.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.IdentityReader.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.IdentityReader.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.IdentityReader.supports_serialize
Whether the Reader implementation can serialize its state.
class tf.TFRecordReader
A Reader that outputs the records from a TFRecords file.
See ReaderBase for supported methods.
tf.TFRecordReader.__init__(name=None)
Create a TFRecordReader.
Args:
name
: A name for the operation (optional).
tf.TFRecordReader.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.TFRecordReader.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.TFRecordReader.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.TFRecordReader.reader_ref
Op that implements the reader.
tf.TFRecordReader.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.TFRecordReader.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.TFRecordReader.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.TFRecordReader.supports_serialize
Whether the Reader implementation can serialize its state.
class tf.FixedLengthRecordReader
A Reader that outputs fixed-length records from a file.
See ReaderBase for supported methods.
tf.FixedLengthRecordReader.__init__(record_bytes, header_bytes=None, footer_bytes=None, name=None)
Create a FixedLengthRecordReader.
Args:
record_bytes
: An int.
header_bytes
: An optional int. Defaults to 0.
footer_bytes
: An optional int. Defaults to 0.
name
: A name for the operation (optional).
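As a rough illustration of what the three size arguments mean, here is a plain-Python sketch that splits a byte string into fixed-length records. It assumes that header_bytes are skipped at the start of the file and footer_bytes at the end, and it treats a trailing partial record as an error; both the helper name and that error policy are assumptions, not documented behavior of the reader.

```python
def fixed_length_records(data, record_bytes, header_bytes=0, footer_bytes=0):
    """Split a bytes object into equal-sized records, ignoring header and footer."""
    if record_bytes <= 0:
        raise ValueError("record_bytes must be positive")
    body = data[header_bytes:len(data) - footer_bytes]
    if len(body) % record_bytes != 0:
        # Assumption: a trailing partial record is treated as an error here.
        raise ValueError("body length is not a multiple of record_bytes")
    return [body[i:i + record_bytes] for i in range(0, len(body), record_bytes)]


# A 2-byte header, two 3-byte records, and a 1-byte footer.
file_contents = b"HH" + b"abc" + b"def" + b"F"
records = fixed_length_records(file_contents, record_bytes=3,
                               header_bytes=2, footer_bytes=1)
print(records)
```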
tf.FixedLengthRecordReader.num_records_produced(name=None)
Returns the number of records this reader has produced.
This is the same as the number of Read executions that have succeeded.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.FixedLengthRecordReader.num_work_units_completed(name=None)
Returns the number of work units this reader has finished processing.
Args:
name
: A name for the operation (optional).
Returns:
An int64 Tensor.
tf.FixedLengthRecordReader.read(queue, name=None)
Returns the next record (key, value pair) produced by a reader.
Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
Args:
queue
: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
name
: A name for the operation (optional).
Returns:
A tuple of Tensors (key, value).
key
: A string scalar Tensor.
value
: A string scalar Tensor.
tf.FixedLengthRecordReader.reader_ref
Op that implements the reader.
tf.FixedLengthRecordReader.reset(name=None)
Restore a reader to its initial clean state.
Args:
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.FixedLengthRecordReader.restore_state(state, name=None)
Restore a reader to a previously saved state.
Not all Readers support being restored, so this can produce an Unimplemented error.
Args:
state
: A string Tensor. Result of a SerializeState of a Reader with matching type.
name
: A name for the operation (optional).
Returns:
The created Operation.
tf.FixedLengthRecordReader.serialize_state(name=None)
Produce a string tensor that encodes the state of a reader.
Not all Readers support being serialized, so this can produce an Unimplemented error.
Args:
name
: A name for the operation (optional).
Returns:
A string Tensor.
tf.FixedLengthRecordReader.supports_serialize
Whether the Reader implementation can serialize its state.
Converting
TensorFlow provides several operations that you can use to convert various data formats into tensors.
tf.decode_csv(records, record_defaults, field_delim=None, name=None)
Convert CSV records to tensors. Each column maps to one tensor.
Records are expected in RFC 4180 format (https://tools.ietf.org/html/rfc4180). Note that leading and trailing spaces are allowed in int and float fields.
Args:
records
: A Tensor of type string. Each string is a record/row in the CSV and all records should have the same format.
record_defaults
: A list of Tensor objects with types from: float32, int32, int64, string. One tensor per column of the input record, with either a scalar default value for that column or empty if the column is required.
field_delim
: An optional string. Defaults to ",". Delimiter to separate fields in a record.
name
: A name for the operation (optional).
Returns:
A list of Tensor objects. Has the same type as record_defaults. Each tensor will have the same shape as records.
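The column-wise output and the role of the defaults can be sketched with Python's standard csv module. This is an approximation of the behavior described above, not the op itself: decode_csv_sketch is a hypothetical helper, and representing each default as a (type, default) pair, with None marking a required column, is an assumption standing in for typed default tensors.

```python
import csv


def decode_csv_sketch(records, record_defaults, field_delim=","):
    """Parse CSV rows into one Python list per column.

    record_defaults is a list of (type, default) pairs, a hypothetical
    stand-in for typed default tensors; default=None marks a required column.
    """
    columns = [[] for _ in record_defaults]
    for row in csv.reader(records, delimiter=field_delim):
        if len(row) != len(record_defaults):
            raise ValueError("expected %d fields, got %d"
                             % (len(record_defaults), len(row)))
        for column, field, (kind, default) in zip(columns, row, record_defaults):
            if field == "":
                if default is None:
                    raise ValueError("missing value for a required column")
                column.append(default)
            else:
                # int() and float() tolerate surrounding spaces.
                column.append(kind(field))
    return columns


ids, scores, labels = decode_csv_sketch(
    ["1, 2.5 ,a", ",3.0,"],
    [(int, -1), (float, None), (str, "n/a")])
print(ids, scores, labels)
```

Each returned list has one entry per input record, mirroring the statement that every output tensor has the same shape as records.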
tf.decode_raw(bytes, out_type, little_endian=None, name=None)
Reinterpret the bytes of a string as a vector of numbers.
Args:
bytes
: A Tensor of type string. All the elements must have the same length.
out_type
: A tf.DType from: tf.float32, tf.float64, tf.int32, tf.uint8, tf.int16, tf.int8, tf.int64.
little_endian
: An optional bool. Defaults to True. Whether the input bytes are in little-endian order. Ignored for out_types that are stored in a single byte like uint8.
name
: A name for the operation (optional).
Returns:
A Tensor of type out_type. A Tensor with one more dimension than the input bytes. The added dimension will have size equal to the length of the elements of bytes divided by the number of bytes to represent out_type.
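The reinterpretation of bytes as numbers, and the effect of little_endian, can be illustrated with Python's struct module. This sketch handles a single byte string rather than a tensor of them; decode_raw_sketch and the string type names in its lookup table are assumptions made for the example.

```python
import struct

# struct format codes for the element types listed above (standard sizes).
_FORMAT_CODES = {
    "float32": "f", "float64": "d", "int32": "i", "uint8": "B",
    "int16": "h", "int8": "b", "int64": "q",
}


def decode_raw_sketch(data, out_type, little_endian=True):
    """Reinterpret a bytes object as a list of numbers of the given type."""
    code = _FORMAT_CODES[out_type]
    size = struct.calcsize("<" + code)
    if len(data) % size != 0:
        raise ValueError("input length is not a multiple of the element size")
    count = len(data) // size   # length of the added dimension
    prefix = "<" if little_endian else ">"
    return list(struct.unpack("%s%d%s" % (prefix, count, code), data))


raw = b"\x01\x00\x02\x00"
print(decode_raw_sketch(raw, "int16"))                       # little-endian
print(decode_raw_sketch(raw, "int16", little_endian=False))  # big-endian
print(decode_raw_sketch(raw, "uint8"))                       # byte order irrelevant
```

Four input bytes yield two int16 values or four uint8 values, which is the "length divided by element size" rule stated in the Returns section.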
Example protocol buffer
TensorFlow's recommended format for training examples is serialized Example protocol buffers, described here. They contain Features, described here.
tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')
Parses Example protos.
Parses a number of serialized Example (https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) protos given in serialized.
names may contain descriptive names for the corresponding serialized protos. These may be useful for debugging purposes, but they have no effect on the output. If not None, names must be the same length as serialized.
This op parses serialized examples into a dictionary mapping keys to Tensor and SparseTensor objects respectively, depending on whether the keys appear in dense_keys or sparse_keys.
The key dense_keys[j] is mapped to a Tensor of type dense_types[j] and of shape (serialized.size(),) + dense_shapes[j].
dense_defaults provides defaults for values referenced using dense_keys. If a key is not present in this dictionary, the corresponding dense Feature is required in all elements of serialized.
dense_shapes[j] provides the shape of each Feature entry referenced by dense_keys[j]. The Feature corresponding to dense_keys[j] must always have np.prod(dense_shapes[j]) entries. The returned Tensor for dense_keys[j] has shape [N] + dense_shapes[j], where N is the number of Examples in serialized.
The key sparse_keys[j] is mapped to a SparseTensor of type sparse_types[j]. The SparseTensor represents a ragged matrix. Its indices are [batch, index] where batch is the batch entry the value is from, and index is the value's index in the list of values associated with that feature and example.
Examples:
For example, if one expects a tf.float32 sparse feature ft and three serialized Examples are provided:
serialized = [
  features:
    { feature: [ { key: "ft" value: { float_list: { value: [1.0, 2.0] } } } ] },
  features:
    { feature: [] },
  features:
    { feature: [ { key: "ft" value: { float_list: { value: [3.0] } } } ] }
]
then the output will look like:
{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
                    values=[1.0, 2.0, 3.0],
                    shape=(3, 2)) }
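The mapping from a ragged batch to the [batch, index] layout above can be reproduced in a few lines of plain Python. This is only a sketch of the indexing rule; ragged_to_sparse is a hypothetical helper that returns plain lists and a tuple rather than a SparseTensor.

```python
def ragged_to_sparse(rows):
    """Convert a list of variable-length rows to (indices, values, shape)."""
    indices, values = [], []
    for batch, row in enumerate(rows):
        for index, value in enumerate(row):
            indices.append([batch, index])
            values.append(value)
    # The second dimension is the length of the longest row (0 if all are empty).
    shape = (len(rows), max((len(row) for row in rows), default=0))
    return indices, values, shape


# The "ft" values of the three Examples above; the second Example has none.
indices, values, shape = ragged_to_sparse([[1.0, 2.0], [], [3.0]])
print(indices, values, shape)
```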
Given two Example input protos in serialized:
[
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "knit", "big" ] } } }
    feature: { key: "gps" value: { float_list: { value: [] } } }
  },
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "emmy" ] } } }
    feature: { key: "dank" value: { int64_list: { value: [ 42 ] } } }
    feature: { key: "gps" value: { } }
  }
]
And arguments:
  names: ["input0", "input1"],
  sparse_keys: ["kw", "dank", "gps"],
  sparse_types: [DT_STRING, DT_INT64, DT_FLOAT]
Then the output is a dictionary:
{
  "kw": SparseTensor(
      indices=[[0, 0], [0, 1], [1, 0]],
      values=["knit", "big", "emmy"],
      shape=[2, 2]),
  "dank": SparseTensor(
      indices=[[1, 0]],
      values=[42],
      shape=[2, 1]),
  "gps": SparseTensor(
      indices=[],
      values=[],
      shape=[2, 0]),
}
For dense results in two serialized Examples:
[
  features: {
    feature: { key: "age" value: { int64_list: { value: [ 0 ] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
  },
  features: {
    feature: { key: "age" value: { int64_list: { value: [] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
  }
]
We can use arguments:
  names: ["input0", "input1"],
  dense_keys: np.array(["age", "gender"]),
  dense_types: [tf.int64, tf.string],
  dense_defaults: {
    "age": -1  # "age" defaults to -1 if missing
               # "gender" has no specified default so it's required
  },
  dense_shapes: [(1,), (1,)],  # age, gender
And the expected output is:
{
  "age": [[0], [-1]],
  "gender": [["f"], ["f"]],
}
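The way dense_defaults fills in missing values can be mimicked in plain Python. The sketch below represents each Example as a dict from feature name to a list of values, which is an assumption made for illustration, as is the helper name dense_column; it also only covers features of shape (1,), as in the example above.

```python
def dense_column(examples, key, default=None):
    """Collect one dense feature across a batch, applying a default if given."""
    column = []
    for example in examples:
        values = example.get(key, [])
        if not values:
            if default is None:
                # No default means the feature is required in every Example.
                raise ValueError("feature %r is required but missing" % key)
            values = [default]
        column.append(list(values))
    return column


examples = [
    {"age": [0], "gender": ["f"]},
    {"age": [], "gender": ["f"]},
]
result = {
    "age": dense_column(examples, "age", default=-1),
    "gender": dense_column(examples, "gender"),
}
print(result)
```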
Args:
serialized
: A list of strings, a batch of binary serialized Example protos.
names
: A list of strings, the names of the serialized protos.
sparse_keys
: A list of string keys in the examples' features. The results for these keys will be returned as SparseTensor objects.
sparse_types
: A list of DTypes of the same length as sparse_keys. Only tf.float32 (FloatList), tf.int64 (Int64List), and tf.string (BytesList) are supported.
dense_keys
: A list of string keys in the examples' features. The results for these keys will be returned as Tensors.
dense_types
: A list of DTypes of the same length as dense_keys. Only tf.float32 (FloatList), tf.int64 (Int64List), and tf.string (BytesList) are supported.
dense_defaults
: A dict mapping string keys to Tensors. The keys of the dict must match the dense_keys of the feature.
dense_shapes
: A list of tuples with the same length as dense_keys. The shape of the data for each dense feature referenced by dense_keys.
name
: A name for this operation (optional).
Returns:
A dict mapping keys to Tensors and SparseTensors.
Raises:
ValueError
: If sparse and dense key sets intersect, or input lengths do not match up.
tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')
Parses a single Example proto.
Similar to parse_example, except:
For dense tensors, the returned Tensor is identical to the output of parse_example, except there is no batch dimension; the output shape is the same as the shape given in dense_shapes.
For SparseTensors, the first (batch) column of the indices matrix is removed (the indices matrix is a column vector), the values vector is unchanged, and the first (batch_size) entry of the shape vector is removed (it is now a single element vector).
See also parse_example.
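The sparse case can be pictured as dropping the batch coordinate from a one-Example batch result. The following plain-Python sketch illustrates that relationship only; drop_batch_dim is a hypothetical helper operating on lists, not part of the API.

```python
def drop_batch_dim(indices, values, shape):
    """Remove the leading batch coordinate from a sparse (indices, values, shape)."""
    # Each [batch, index] pair becomes [index]: a column vector of indices.
    single_indices = [index[1:] for index in indices]
    # The values are unchanged; the shape loses its first (batch_size) entry.
    return single_indices, list(values), list(shape[1:])


# A batch containing one Example with two "kw" values.
batched = ([[0, 0], [0, 1]], ["knit", "big"], [1, 2])
single = drop_batch_dim(*batched)
print(single)
```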
Args:
serialized
: A scalar string, a single serialized Example. See parse_example documentation for more details.
names
: (Optional) A scalar string, the associated name. See parse_example documentation for more details.
sparse_keys
: See parse_example documentation for more details.
sparse_types
: See parse_example documentation for more details.
dense_keys
: See parse_example documentation for more details.
dense_types
: See parse_example documentation for more details.
dense_defaults
: See parse_example documentation for more details.
dense_shapes
: See parse_example documentation for more details.
name
: A name for this operation (optional).
Returns:
A dictionary mapping keys to Tensors and SparseTensors.
Raises:
ValueError
: If serialized or names have known shapes and are not scalars.
Queues
TensorFlow provides several implementations of 'Queues', which are structures within the TensorFlow computation graph to stage pipelines of tensors together. The following sections describe the basic Queue interface and some implementations. For an example of their use, see Threading and Queues.
class tf.QueueBase
Base class for queue implementations.
A queue is a TensorFlow data structure that stores tensors across multiple steps, and exposes operations that enqueue and dequeue tensors.
Each queue element is a tuple of one or more tensors, where each tuple component has a static dtype, and may have a static shape. The queue implementations support versions of enqueue and dequeue that handle single elements, and versions that enqueue and dequeue a batch of elements at once.
See tf.FIFOQueue and tf.RandomShuffleQueue for concrete implementations of this class, and instructions on how to create them.
tf.QueueBase.enqueue(vals, name=None)
Enqueues one element to this queue.
If the queue is full when this operation executes, it will block until the element has been enqueued.
Args:
vals
: The tuple of Tensor objects to be enqueued.
name
: A name for the operation (optional).
Returns:
The operation that enqueues a new tuple of tensors to the queue.
tf.QueueBase.enqueue_many(vals, name=None)
Enqueues zero or more elements to this queue.
This operation slices each component tensor along the 0th dimension to make multiple queue elements. All of the tensors in vals must have the same size in the 0th dimension.
If the queue is full when this operation executes, it will block until all of the elements have been enqueued.
Args:
vals
: The tensor or tuple of tensors from which the queue elements are taken.
name
: A name for the operation (optional).
Returns:
The operation that enqueues a batch of tuples of tensors to the queue.
tf.QueueBase.dequeue(name=None)
Dequeues one element from this queue.
If the queue is empty when this operation executes, it will block until there is an element to dequeue.
Args:
name
: A name for the operation (optional).
Returns:
The tuple of tensors that was dequeued.
tf.QueueBase.dequeue_many(n, name=None)
Dequeues and concatenates n elements from this queue.
This operation concatenates queue-element component tensors along the 0th dimension to make a single component tensor. All of the components in the dequeued tuple will have size n in the 0th dimension.
If the queue contains fewer than n elements when this operation executes, it will block until n elements have been dequeued.
Args:
n
: A scalarTensor
containing the number of elements to dequeue.name
: A name for the operation (optional).
Returns:
The tuple of concatenated tensors that was dequeued.
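The slicing done by enqueue_many and the concatenation done by dequeue_many are inverses of each other along the 0th dimension. The plain-Python sketch below models this with lists and a deque; both helper names are hypothetical, and where a real queue would block, the sketch raises instead.

```python
from collections import deque


def enqueue_many_sketch(queue, vals):
    """Slice each component along its 0th dimension into separate queue elements."""
    sizes = {len(component) for component in vals}
    if len(sizes) != 1:
        raise ValueError("all components must have the same size in dimension 0")
    for i in range(sizes.pop()):
        queue.append(tuple(component[i] for component in vals))


def dequeue_many_sketch(queue, n):
    """Dequeue n elements and regroup them component by component."""
    if len(queue) < n:
        # A real queue would block here until enough elements arrive.
        raise RuntimeError("fewer than n elements available")
    elements = [queue.popleft() for _ in range(n)]
    return tuple(list(component) for component in zip(*elements))


q = deque()
enqueue_many_sketch(q, ([1, 2, 3], ["a", "b", "c"]))
queued = list(q)                    # three elements, each a 2-tuple
batch = dequeue_many_sketch(q, 2)   # two components, each of size 2
print(queued, batch)
```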
tf.QueueBase.size(name=None)
Compute the number of elements in this queue.
Args:
name
: A name for the operation (optional).
Returns:
A scalar tensor containing the number of elements in this queue.
tf.QueueBase.close(cancel_pending_enqueues=False, name=None)
Closes this queue.
This operation signals that no more elements will be enqueued in the given queue. Subsequent enqueue and enqueue_many operations will fail. Subsequent dequeue and dequeue_many operations will continue to succeed if sufficient elements remain in the queue. Subsequent dequeue and dequeue_many operations that would block will fail immediately.
If cancel_pending_enqueues is True, all pending enqueue requests will also be cancelled.
Args:
cancel_pending_enqueues
: (Optional.) A boolean, defaulting to False (described above).
name
: A name for the operation (optional).
Returns:
The operation that closes the queue.
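The rules for a closed queue can be summarised in a small single-threaded model. This is a sketch under stated assumptions: ClosableQueueSketch and QueueClosedError are hypothetical names, nothing here blocks, and pending-enqueue cancellation is not modelled.

```python
from collections import deque


class QueueClosedError(Exception):
    """Raised by the sketch when an operation fails because the queue is closed."""


class ClosableQueueSketch:
    """Toy, non-blocking model of the close() semantics described above."""

    def __init__(self):
        self._items = deque()
        self._closed = False

    def enqueue(self, item):
        if self._closed:
            raise QueueClosedError("enqueue on a closed queue fails")
        self._items.append(item)

    def dequeue(self):
        if self._items:
            # Elements already in the queue can still be dequeued after close().
            return self._items.popleft()
        if self._closed:
            raise QueueClosedError("queue is closed and empty")
        raise RuntimeError("queue is empty; a real queue would block here")

    def close(self):
        self._closed = True


q = ClosableQueueSketch()
q.enqueue("a")
q.close()
first = q.dequeue()   # succeeds: the element was enqueued before close()
print(first)
```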
Other Methods
tf.QueueBase.__init__(dtypes, shapes, queue_ref)
Constructs a queue object from a queue reference.
Args:
dtypes
: A list of types. The length of dtypes must equal the number of tensors in each element.
shapes
: Constraints on the shapes of tensors in an element: a list of shape tuples or None. This list is the same length as dtypes. If the shape of any tensor in the element is constrained, all must be; shapes can be None if the shapes should not be constrained.
queue_ref
: The queue reference, i.e. the output of the queue op.
tf.QueueBase.dtypes
The list of dtypes for each component of a queue element.
tf.QueueBase.name
The name of the underlying queue.
tf.QueueBase.queue_ref
The underlying queue reference.
class tf.FIFOQueue
A queue implementation that dequeues elements in first-in first-out order.
See tf.QueueBase for a description of the methods on this class.
tf.FIFOQueue.__init__(capacity, dtypes, shapes=None, shared_name=None, name='fifo_queue')
Creates a queue that dequeues elements in a first-in first-out order.
A FIFOQueue has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery.
A FIFOQueue holds a list of up to capacity elements. Each element is a fixed-length tuple of tensors whose dtypes are described by dtypes, and whose shapes are optionally described by the shapes argument.
If the shapes argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of dequeue_many is disallowed.
Args:
capacity
: An integer. The upper bound on the number of elements that may be stored in this queue.
dtypes
: A list of DType objects. The length of dtypes must equal the number of tensors in each queue element.
shapes
: (Optional.) A list of fully-defined TensorShape objects, with the same length as dtypes, or None.
shared_name
: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions.
name
: Optional name for the queue operation.
class tf.RandomShuffleQueue
A queue implementation that dequeues elements in a random order.
See tf.QueueBase for a description of the methods on this class.
tf.RandomShuffleQueue.__init__(capacity, min_after_dequeue, dtypes, shapes=None, seed=None, shared_name=None, name='random_shuffle_queue')
Create a queue that dequeues elements in a random order.
A RandomShuffleQueue has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery.
A RandomShuffleQueue holds a list of up to capacity elements. Each element is a fixed-length tuple of tensors whose dtypes are described by dtypes, and whose shapes are optionally described by the shapes argument.
If the shapes argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of dequeue_many is disallowed.
The min_after_dequeue argument allows the caller to specify a minimum number of elements that will remain in the queue after a dequeue or dequeue_many operation completes, to ensure a minimum level of mixing of elements. This invariant is maintained by blocking those operations until sufficient elements have been enqueued. The min_after_dequeue argument is ignored after the queue has been closed.
Args:
- capacity: An integer. The upper bound on the number of elements that may be stored in this queue.
- min_after_dequeue: An integer (described above).
- dtypes: A list of DType objects. The length of dtypes must equal the number of tensors in each queue element.
- shapes: (Optional.) A list of fully-defined TensorShape objects, with the same length as dtypes, or None.
- seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
- shared_name: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions.
- name: Optional name for the queue operation.
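To make the min_after_dequeue invariant concrete, here is a small pure-Python model. It is a sketch under stated assumptions, not TensorFlow's implementation: the class ShuffleQueueModel and the helper can_dequeue are hypothetical, and the model raises an error where a real queue would block.

```python
import random


class ShuffleQueueModel:
    """Toy model of random-order dequeue with a min_after_dequeue floor (not TensorFlow code)."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self.closed = False
        self._items = []
        self._rng = random.Random(seed)

    def enqueue(self, item):
        if len(self._items) >= self.capacity:
            raise RuntimeError("queue is full; a real queue would block here")
        self._items.append(item)

    def close(self):
        self.closed = True

    def can_dequeue(self):
        # After close() the floor is ignored, so the queue can be drained.
        floor = 0 if self.closed else self.min_after_dequeue
        return len(self._items) > floor

    def dequeue(self):
        if not self.can_dequeue():
            raise RuntimeError("too few elements; a real queue would block here")
        index = self._rng.randrange(len(self._items))
        return self._items.pop(index)


q = ShuffleQueueModel(capacity=10, min_after_dequeue=2, seed=0)
for value in range(3):
    q.enqueue(value)
taken = q.dequeue()            # allowed: 3 elements exceed the floor of 2
blocked = not q.can_dequeue()  # only 2 remain, so the floor now applies
q.close()
drained = sorted([taken, q.dequeue(), q.dequeue()])
```

A larger min_after_dequeue gives each dequeue more candidates to choose from, at the cost of holding more elements in the queue before the first dequeue can complete.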
Dealing with the filesystem
tf.matching_files(pattern, name=None)
Returns the set of files matching a pattern.
Note that this routine only supports wildcard characters in the basename portion of the pattern, not in the directory portion.
Args:
- pattern: A Tensor of type string. A (scalar) shell wildcard pattern.
- name: A name for the operation (optional).
Returns:
A Tensor of type string. A vector of matching filenames.
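The basename-only wildcard rule can be illustrated with the Python standard library. This is a rough sketch of the matching behaviour described above, not the TensorFlow op itself; the function name match_basename is hypothetical.

```python
import fnmatch
import os
import tempfile


def match_basename(pattern):
    """List files whose basename matches a shell wildcard (illustrative, not TensorFlow code)."""
    directory, base_pattern = os.path.split(pattern)
    directory = directory or "."
    # The directory part is used literally; only base_pattern may contain wildcards.
    names = sorted(os.listdir(directory))
    return [os.path.join(directory, name)
            for name in names
            if fnmatch.fnmatchcase(name, base_pattern)
            and os.path.isfile(os.path.join(directory, name))]


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    matches = [os.path.basename(p) for p in match_basename(os.path.join(tmp, "*.csv"))]
```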
tf.read_file(filename, name=None)
Reads and outputs the entire contents of the input filename.
Args:
- filename: A Tensor of type string.
- name: A name for the operation (optional).
Returns:
A Tensor of type string.
Input pipeline
TensorFlow functions for setting up an input-prefetching pipeline. Please see the reading data how-to for context.
Beginning of an input pipeline
The "producer" functions add a queue to the graph and a corresponding QueueRunner for running the subgraph that fills that queue.
tf.train.match_filenames_once(pattern, name=None)
Save the list of files matching pattern, so it is only computed once.
Args:
- pattern: A file pattern (glob).
- name: A name for the operations (optional).
Returns:
A variable that is initialized to the list of files matching pattern.
tf.train.limit_epochs(tensor, num_epochs=None, name=None)
Returns tensor num_epochs times and then raises an OutOfRange error.
Args:
- tensor: Any Tensor.
- num_epochs: An integer (optional). If specified, limits the number of times the output tensor may be evaluated.
- name: A name for the operations (optional).
Returns:
tensor or OutOfRange.
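The counting behaviour can be sketched in plain Python. This is a toy model only: EpochLimiter and the OutOfRange exception class defined here are hypothetical stand-ins, not TensorFlow objects.

```python
class OutOfRange(Exception):
    """Hypothetical stand-in for TensorFlow's OutOfRange error."""


class EpochLimiter:
    """Returns a value at most num_epochs times, then raises OutOfRange (toy model)."""

    def __init__(self, value, num_epochs=None):
        self._value = value
        self._num_epochs = num_epochs
        self._count = 0

    def evaluate(self):
        if self._num_epochs is not None and self._count >= self._num_epochs:
            raise OutOfRange("epoch limit reached")
        self._count += 1
        return self._value


limited = EpochLimiter("data", num_epochs=2)
results = [limited.evaluate(), limited.evaluate()]
try:
    limited.evaluate()
    raised = False
except OutOfRange:
    raised = True
```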
tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
Produces the integers from 0 to limit-1 in a queue.
Args:
- limit: An int32 scalar tensor.
- num_epochs: An integer (optional). If specified, range_input_producer produces each integer num_epochs times before generating an OutOfRange error. If not specified, range_input_producer can cycle through the integers an unlimited number of times.
- shuffle: Boolean. If true, the integers are randomly shuffled within each epoch.
- seed: An integer (optional). Seed used if shuffle == True.
- capacity: An integer. Sets the queue capacity.
- name: A name for the operations (optional).
Returns:
A Queue with the output integers. A QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.
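The epoch and shuffle semantics can be pictured with a Python generator. This is a sketch of the ordering only, assuming that "shuffled within each epoch" means every integer appears exactly once per epoch; the name range_epochs is hypothetical, and the generator simply ends where the real producer would signal OutOfRange.

```python
import random


def range_epochs(limit, num_epochs=None, shuffle=True, seed=None):
    """Yield 0..limit-1 once per epoch, reshuffled each epoch (toy model, not TensorFlow code)."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(range(limit))
        if shuffle:
            rng.shuffle(order)
        for value in order:
            yield value
        epoch += 1


produced = list(range_epochs(limit=4, num_epochs=2, shuffle=True, seed=1))
first_epoch, second_epoch = produced[:4], produced[4:]
```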
tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
Produces a slice of each Tensor in tensor_list.
Implemented using a Queue -- a QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.
Args:
- tensor_list: A list of Tensors. Every Tensor in tensor_list must have the same size in the first dimension.
- num_epochs: An integer (optional). If specified, slice_input_producer produces each slice num_epochs times before generating an OutOfRange error. If not specified, slice_input_producer can cycle through the slices an unlimited number of times.
- shuffle: Boolean. If true, the slices are randomly shuffled within each epoch.
- seed: An integer (optional). Seed used if shuffle == True.
- capacity: An integer. Sets the queue capacity.
- name: A name for the operations (optional).
Returns:
A list of tensors, one for each element of tensor_list. If a tensor in tensor_list has shape [N, a, b, ..., z], then the corresponding output tensor will have shape [a, b, ..., z].
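The shape relationship can be seen with nested Python lists standing in for tensors. This is only a sketch of what one produced slice looks like; the helper slice_lists is hypothetical and performs no queueing or shuffling.

```python
def slice_lists(tensor_list, index):
    """Take row `index` from every nested list (toy model of one produced slice)."""
    sizes = {len(t) for t in tensor_list}
    if len(sizes) != 1:
        raise ValueError("all inputs must have the same size in the first dimension")
    return [t[index] for t in tensor_list]


images = [[[0, 1], [2, 3]], [[4, 5], [6, 7]], [[8, 9], [10, 11]]]  # shape [3, 2, 2]
labels = [7, 8, 9]                                                   # shape [3]
image, label = slice_lists([images, labels], index=1)
```

Here image has shape [2, 2] and label is a scalar: the leading dimension of size 3 has been sliced away from both inputs.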
tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)
Output strings (e.g. filenames) to a queue for an input pipeline.
Args:
- string_tensor: A 1-D string tensor with the strings to produce.
- num_epochs: An integer (optional). If specified, string_input_producer produces each string from string_tensor num_epochs times before generating an OutOfRange error. If not specified, string_input_producer can cycle through the strings in string_tensor an unlimited number of times.
- shuffle: Boolean. If true, the strings are randomly shuffled within each epoch.
- seed: An integer (optional). Seed used if shuffle == True.
- capacity: An integer. Sets the queue capacity.
- name: A name for the operations (optional).
Returns:
A queue with the output strings. A QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection.
Batching at the end of an input pipeline
These functions add a queue to the graph to assemble a batch of examples, with possible shuffling. They also add a QueueRunner for running the subgraph that fills that queue.
Use batch or batch_join for batching examples that have already been well shuffled. Use shuffle_batch or shuffle_batch_join for examples that would benefit from additional shuffling.
Use batch or shuffle_batch if you want a single thread producing examples to batch, or if you have a single subgraph producing examples but you want to run it in N threads (where you increase N until it can keep the queue full). Use batch_join or shuffle_batch_join if you have N different subgraphs producing examples to batch and you want them run by N threads.
tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)
Creates batches of tensors in tensor_list.
This function is implemented using a queue. A QueueRunner for the queue is added to the current Graph's QUEUE_RUNNER collection.
If enqueue_many is False, tensor_list is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].
If enqueue_many is True, tensor_list is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].
The capacity argument controls how long the prefetching is allowed to grow the queues.
Args:
- tensor_list: The list of tensors to enqueue.
- batch_size: The new batch size pulled from the queue.
- num_threads: The number of threads enqueuing tensor_list.
- capacity: An integer. The maximum number of elements in the queue.
- enqueue_many: If False, tensor_list is assumed to represent a single example; if True, a batch of examples indexed by the first dimension.
- shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list.
- name: (Optional) A name for the operations.
Returns:
A list of tensors with the same number and types as tensor_list.
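The effect of enqueue_many on output shapes can be written down as a small helper. This is shape bookkeeping only, a sketch rather than anything TensorFlow runs; the function name batched_shape is hypothetical.

```python
def batched_shape(input_shape, batch_size, enqueue_many=False):
    """Shape of a batched output for one input tensor (shape bookkeeping only)."""
    input_shape = list(input_shape)
    if enqueue_many:
        # The leading dimension indexes examples and is replaced by batch_size.
        if not input_shape:
            raise ValueError("enqueue_many requires at least one dimension")
        return [batch_size] + input_shape[1:]
    # A single example gains a new leading batch dimension.
    return [batch_size] + input_shape


single = batched_shape([28, 28, 3], batch_size=32)
many = batched_shape([100, 28, 28, 3], batch_size=32, enqueue_many=True)
```

Both calls describe the same output shape, [32, 28, 28, 3]; the difference is whether the input already carries a leading example dimension.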
tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)
Runs a list of tensors to fill a queue to create batches of examples.
Enqueues a different list of tensors in different threads. Implemented using a queue -- a QueueRunner for the queue is added to the current Graph's QUEUE_RUNNER collection.
len(tensor_list_list) threads will be started, with thread i enqueuing the tensors from tensor_list_list[i]. tensor_list_list[i1][j] must match tensor_list_list[i2][j] in type and shape, except in the first dimension if enqueue_many is true.
If enqueue_many is False, each tensor_list_list[i] is assumed to represent a single example. An input tensor x will be output as a tensor with shape [batch_size] + x.shape.
If enqueue_many is True, tensor_list_list[i] is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list_list[i] should have the same size in the first dimension. The slices of any input tensor x are treated as examples, and the output tensors will have shape [batch_size] + x.shape[1:].
The capacity argument controls how long the prefetching is allowed to grow the queues.
Args:
- tensor_list_list: A list of tuples of tensors to enqueue.
- batch_size: An integer. The new batch size pulled from the queue.
- capacity: An integer. The maximum number of elements in the queue.
- enqueue_many: If False, each tensor_list_list[i] is assumed to represent a single example; if True, a batch of examples indexed by the first dimension.
- shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list_list[i].
- name: (Optional) A name for the operations.
Returns:
A list of tensors with the same number and types as tensor_list_list[i].
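The "one thread per source" arrangement can be modelled with the standard library's threading and queue modules. This is a loose sketch of the idea, not how TensorFlow's QueueRunner is implemented; batch_join_model and its arguments are hypothetical, and the sketch assumes the total number of examples does not exceed batch_size plus capacity so that every producer thread can finish.

```python
import queue
import threading


def batch_join_model(sources, batch_size, capacity=32):
    """Fill one bounded queue from len(sources) threads, then take one batch (toy model)."""
    shared = queue.Queue(maxsize=capacity)

    def produce(examples):
        for example in examples:
            shared.put(example)  # blocks while the queue is full

    threads = [threading.Thread(target=produce, args=(examples,)) for examples in sources]
    for thread in threads:
        thread.start()
    # Blocks until batch_size examples have arrived, in whatever order the threads ran.
    batch = [shared.get() for _ in range(batch_size)]
    for thread in threads:
        thread.join()
    return batch


sources = [[("a", 0), ("a", 1)], [("b", 0), ("b", 1)]]
batch = batch_join_model(sources, batch_size=4)
```

Because the threads interleave freely, the order of examples within the batch is not deterministic; only the set of examples is.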
tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)
Creates batches by randomly shuffling tensors.
This function adds the following to the current Graph:
- A shuffling queue into which tensors from tensor_list are enqueued.
- A dequeue_many operation to create batches from the queue.
- A QueueRunner to the QUEUE_RUNNER collection, to enqueue the tensors from tensor_list.
If enqueue_many is False, tensor_list is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].
If enqueue_many is True, tensor_list is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].
The capacity argument controls how long the prefetching is allowed to grow the queues.
For example:
# Creates batches of 32 images and 32 labels.
image_batch, label_batch = tf.train.shuffle_batch(
    [single_image, single_label],
    batch_size=32,
    num_threads=4,
    capacity=50000,
    min_after_dequeue=10000)
Args:
- tensor_list: The list of tensors to enqueue.
- batch_size: The new batch size pulled from the queue.
- capacity: An integer. The maximum number of elements in the queue.
- min_after_dequeue: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
- num_threads: The number of threads enqueuing tensor_list.
- seed: Seed for the random shuffling within the queue.
- enqueue_many: If False, tensor_list is assumed to represent a single example; if True, a batch of examples indexed by the first dimension.
- shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list.
- name: (Optional) A name for the operations.
Returns:
A list of tensors with the same number and types as tensor_list.
tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)
Create batches by randomly shuffling tensors.
This version enqueues a different list of tensors in different threads. It adds the following to the current Graph:
- A shuffling queue into which tensors from tensor_list_list are enqueued.
- A dequeue_many operation to create batches from the queue.
- A QueueRunner to the QUEUE_RUNNER collection, to enqueue the tensors from tensor_list_list.
len(tensor_list_list) threads will be started, with thread i enqueuing the tensors from tensor_list_list[i]. tensor_list_list[i1][j] must match tensor_list_list[i2][j] in type and shape, except in the first dimension if enqueue_many is true.
If enqueue_many is False, each tensor_list_list[i] is assumed to represent a single example. An input tensor with shape [x, y, z] will be output as a tensor with shape [batch_size, x, y, z].
If enqueue_many is True, tensor_list_list[i] is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of tensor_list_list[i] should have the same size in the first dimension. If an input tensor has shape [*, x, y, z], the output will have shape [batch_size, x, y, z].
The capacity argument controls how long the prefetching is allowed to grow the queues.
Args:
- tensor_list_list: A list of tuples of tensors to enqueue.
- batch_size: An integer. The new batch size pulled from the queue.
- capacity: An integer. The maximum number of elements in the queue.
- min_after_dequeue: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
- seed: Seed for the random shuffling within the queue.
- enqueue_many: If False, each tensor_list_list[i] is assumed to represent a single example; if True, a batch of examples indexed by the first dimension.
- shapes: (Optional) The shapes for each example. Defaults to the inferred shapes for tensor_list_list[i].
- name: (Optional) A name for the operations.
Returns:
A list of tensors with the same number and types as tensor_list_list[i].