mindspore.mindrecord
Introduction to mindrecord:
MindRecord is a module for reading, writing, searching, and converting datasets in the MindSpore format. Users can load MindRecord data through FileReader and write or modify it through FileWriter. Users can also convert datasets in other formats to MindRecord data through the corresponding sub-module.
- class
mindspore.mindrecord.
FileWriter
(file_name, shard_num=1)[source] Class to write user defined raw data into MindRecord File series.
- Parameters
Raises
ParamValueError – If file_name or shard_num is invalid.
add_index
(index_fields)[source] Select index fields from schema to accelerate reading.
add_schema
(content, desc=None)[source] Returns a schema id if the schema is added successfully; otherwise raises an exception.
commit
()[source] Flush data to disk and generate the corresponding db files.
- Returns
MSRStatus, SUCCESS or FAILED.
Raises
MRMOpenError – If failed to open MindRecord File.
MRMSetHeaderError – If failed to set header.
MRMIndexGeneratorError – If failed to create index generator.
MRMGenerateIndexError – If failed to write to database.
MRMCommitError – If failed to flush data to disk.
classmethod
open_for_append
(file_name)[source] Open MindRecord file and get ready to append data.
- Parameters
file_name (str) – String of MindRecord file name.
Returns
Instance of FileWriter.
Raises
ParamValueError – If file_name is invalid.
FileNameError – If path contains invalid character.
MRMOpenError – If failed to open MindRecord File.
MRMOpenForAppendError – If failed to open file for appending data.
set_header_size
(header_size)[source] Set the size of header.
- Parameters
header_size (int) – Size of header, between 16KB and 128MB.
Returns
MSRStatus, SUCCESS or FAILED.
Raises
- MRMInvalidHeaderSizeError – If failed to set header size.
set_page_size
(page_size)[source] Set the size of page.
- Parameters
page_size (int) – Size of page, between 32KB and 256MB.
Returns
MSRStatus, SUCCESS or FAILED.
Raises
- MRMInvalidPageSizeError – If failed to set page size.
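The documented ranges for the header and page sizes can be checked before calling the setters. The following is a minimal sketch, assuming that KB and MB are binary units (1024-based) and that both bounds are inclusive; neither assumption is stated on this page, and the helper names are hypothetical.

```python
KB = 1024          # assumption: binary kilobyte
MB = 1024 * KB     # assumption: binary megabyte

# Ranges as documented for set_header_size and set_page_size.
HEADER_SIZE_RANGE = (16 * KB, 128 * MB)
PAGE_SIZE_RANGE = (32 * KB, 256 * MB)


def _in_range(size, bounds):
    """Return True if `size` is an int within the inclusive bounds."""
    low, high = bounds
    return isinstance(size, int) and low <= size <= high


def is_valid_header_size(size):
    """Hypothetical pre-check for a value passed to set_header_size."""
    return _in_range(size, HEADER_SIZE_RANGE)


def is_valid_page_size(size):
    """Hypothetical pre-check for a value passed to set_page_size."""
    return _in_range(size, PAGE_SIZE_RANGE)
```

A caller could run such a check first and only then pass the value to the writer, which avoids relying on the error raised by the setter.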
write_raw_data
(raw_data, validate=True)[source] Write raw data and generate sequential pair of MindRecord File.
- Parameters
Raises
ParamTypeError – If index field is invalid.
MRMOpenError – If failed to open MindRecord File.
MRMValidateDataError – If data does not match blob fields.
MRMSetHeaderError – If failed to set header.
MRMWriteDatasetError – If failed to write dataset.
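The FileWriter methods above are typically called in a fixed order: add a schema, select index fields, write the raw data, then commit. The sketch below wraps that order in a helper that accepts any object exposing those methods, so it can be exercised without MindSpore installed. The schema layout (field names and type strings) and the file path in the usage comment are assumptions for illustration, not taken from this page.

```python
# Hypothetical schema: the field names and type strings are assumptions
# chosen for illustration, not taken from this page.
EXAMPLE_SCHEMA = {
    "file_name": {"type": "string"},
    "label": {"type": "int32"},
    "data": {"type": "bytes"},
}


def write_records(writer, schema, index_fields, records, desc=None):
    """Run the add_schema -> add_index -> write_raw_data -> commit sequence.

    `writer` is any object exposing the FileWriter methods documented above.
    Returns the schema id and the status reported by commit().
    """
    schema_id = writer.add_schema(schema, desc)
    writer.add_index(index_fields)
    writer.write_raw_data(records)
    status = writer.commit()
    return schema_id, status


# Usage with MindSpore installed (the output path is hypothetical):
#   from mindspore.mindrecord import FileWriter
#   writer = FileWriter("/tmp/example.mindrecord", shard_num=1)
#   write_records(writer, EXAMPLE_SCHEMA, ["file_name", "label"],
#                 [{"file_name": "1.jpg", "label": 1, "data": b"\x00"}])
```

Passing the writer in, rather than constructing it inside the helper, keeps the call order testable with a stand-in object.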
- class
mindspore.mindrecord.
FileReader
(file_name, num_consumer=4, columns=None, operator=None)[source] Class to read MindRecord File series.
- Parameters
file_name (str) – File name of MindRecord File.
num_consumer (int, optional) – Number of consumer threads that load data into memory (default=4). It should not be smaller than 1 or larger than the number of CPU cores.
columns (list[str], optional) – List of fields whose corresponding data will be read (default=None).
operator (int, optional) – Reserved parameter for operators (default=None).
Raises
ParamValueError – If file_name, num_consumer or columns is invalid.
close
()[source] Stop reader worker and close File.
finish
()[source] Stop reader worker.
- Raises
- MRMFinishError – If failed to finish worker threads.
get_next
()[source] Yield a batch of data according to columns at a time.
- Yields
dict – keys are the same as columns.
Raises
- MRMUnsupportedSchemaError – If schema is invalid.
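Since get_next yields dictionaries and close releases the reader, a common pattern is to iterate inside a try/finally block. The following sketch takes any object with those two methods; the helper name `read_rows` and its `limit` parameter are hypothetical, as is the file path in the usage comment.

```python
def read_rows(reader, limit=None):
    """Collect rows yielded by reader.get_next(), then close the reader.

    `reader` is any object exposing the get_next() and close() methods
    documented above. `limit` (hypothetical) caps the number of rows.
    """
    rows = []
    try:
        for row in reader.get_next():
            if limit is not None and len(rows) >= limit:
                break
            rows.append(row)
    finally:
        # Always release the reader, even if iteration raises.
        reader.close()
    return rows


# Usage with MindSpore installed (the file path is hypothetical):
#   from mindspore.mindrecord import FileReader
#   reader = FileReader("/tmp/example.mindrecord", num_consumer=4,
#                       columns=["file_name", "label"])
#   first_rows = read_rows(reader, limit=10)
```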
- class
mindspore.mindrecord.
MindPage
(file_name, num_consumer=4)[source] Class to read MindRecord File series in pagination.
- Parameters
Raises
ParamValueError – If file_name, num_consumer or columns is invalid.
MRMInitSegmentError – If failed to initialize ShardSegment.
Return candidate category fields.
- Returns
- list[str], by which data could be grouped.
Getter function for category field
get_category_fields
()[source] Return candidate category fields.
read_at_page_by_id
(category_id, page, num_row)[source] Query by category id in pagination.
- Parameters
Returns
List, list[dict].
Raises
ParamValueError – If any parameter is invalid.
MRMFetchDataError – If failed to read by category id.
MRMUnsupportedSchemaError – If schema is invalid.
read_at_page_by_name
(category_name, page, num_row)[source] Query by category name in pagination.
read_category_info
()[source] Return category information when data is grouped by indicated category field.
- Returns
str, description of group information.
Raises
- MRMReadCategoryInfoError – If failed to read category information.
set_category_field
(category_field)[source] Set category field for reading.
Note
Should be a candidate category field.
- Parameters
category_field (str) – String of category field name.
Returns
MSRStatus, SUCCESS or FAILED.
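Paginated reads with read_at_page_by_id can be wrapped in a generator that walks the pages of one category. This is only a sketch under assumptions not stated on this page: page numbers start at 0, and an empty or short page marks the end of the category. The helper name and the `max_pages` safety cap are hypothetical.

```python
def iter_category_rows(page_reader, category_id, num_row, max_pages=1000):
    """Yield every row of one category, fetching `num_row` rows per page.

    `page_reader` is any object exposing read_at_page_by_id as documented
    above. Assumes zero-based pages and that a short or empty page means
    there is no more data; `max_pages` guards against an endless loop if
    that assumption does not hold.
    """
    for page in range(max_pages):
        rows = page_reader.read_at_page_by_id(category_id, page, num_row)
        if not rows:
            break
        yield from rows
        if len(rows) < num_row:
            break


# Usage with MindSpore installed (the file path and field name are hypothetical):
#   from mindspore.mindrecord import MindPage
#   reader = MindPage("/tmp/example.mindrecord")
#   reader.set_category_field("label")
#   for row in iter_category_rows(reader, category_id=0, num_row=100):
#       ...
```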
- class
mindspore.mindrecord.
Cifar10ToMR
(source, destination)[source] Class for transforming the CIFAR-10 dataset to MindRecord.
- Parameters
Raises
ValueError – If source or destination is invalid.
transform
(fields=None)[source] Executes transformation from CIFAR-10 to MindRecord.
- class
mindspore.mindrecord.
Cifar100ToMR
(source, destination)[source] Class for transforming the CIFAR-100 dataset to MindRecord.
- Parameters
Raises
ValueError – If source or destination is invalid.
transform
(fields=None)[source] Executes transformation from CIFAR-100 to MindRecord.
- class
mindspore.mindrecord.
ImageNetToMR
(map_file, image_dir, destination, partition_number=1)[source] Class for transforming the ImageNet dataset to MindRecord.
- Parameters
map_file (str) – The map file that indicates the labels. The map file content should look like this:
    n02119789 1 pen
    n02100735 2 notebook
    n02110185 3 mouse
    n02096294 4 orange
image_dir (str) – Image directory containing the n02119789, n02100735, n02110185, and n02096294 directories.
destination (str) – The MindRecord file path to transform into.
partition_number (int, optional) – Partition size (default=1).
- Raises
ValueError – If map_file, image_dir or destination is invalid.
transform
()[source] Executes transformation from ImageNet to MindRecord.
- Returns
- SUCCESS/FAILED, whether successfully written into MindRecord.
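The map file format shown for map_file can be sanity-checked before starting a conversion. The sketch below parses lines of that form, assuming whitespace-separated columns in the order directory name, integer label, optional readable name. The helper name is hypothetical and is not part of the module.

```python
def parse_map_lines(lines):
    """Parse map-file lines into a {directory_name: label} dictionary.

    Assumes each non-empty line looks like "n02119789 1 pen": a directory
    name, an integer label, and an optional human-readable name.
    """
    mapping = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) < 2:
            raise ValueError("expected '<dir> <label> [name]', got: %r" % line)
        mapping[parts[0]] = int(parts[1])
    return mapping


# Usage (the path is hypothetical):
#   with open("/tmp/imagenet_map.txt") as f:
#       labels = parse_map_lines(f)
```

Comparing the keys of the result against the sub-directories of image_dir is one way to catch a mismatch early.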
- class
mindspore.mindrecord.
MnistToMR
(source, destination, partition_number=1)[source] Class for transforming the MNIST dataset to MindRecord.
- Parameters
Raises
ValueError – If source/destination/partition_number is invalid.
transform
()[source] Executes transformation from MNIST to MindRecord.
- Returns
- SUCCESS/FAILED, whether successfully written into MindRecord.