sdthdf
Purpose
Description
sdthdf handles MATLAB data/metadata information. Its main purpose is to deal efficiently with the binary MATLAB .mat file format that is based on the HDF file format.
The hdf5 file format, supported by MATLAB since version 7.3, allows very efficient data access from files. Partial loading is possible, as is locating data through pointers. sdthdf lets the user free RAM by saving specific data to dedicated files, and optimize file loading using pointers. To use these functionalities, the file must have been saved in hdf5 format, which is activated in MATLAB with the -v7.3 option of the save function.
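As a minimal sketch of what the -v7.3 option enables, the following uses only base MATLAB functions (save, whos, matfile) and not sdthdf itself. The file and variable names are illustrative assumptions.

```matlab
% Illustrative names only; base MATLAB, not sdthdf
fname=[tempname '.mat'];
A=rand(200,50); info=struct('name','demo');
% -v7.3 selects the HDF5-based MAT-file format
save(fname,'A','info','-v7.3');
% List the stored variables without loading their content
vars=whos('-file',fname);
disp({vars.name})
% Partial loading through the built-in matfile object
m=matfile(fname);
block=m.A(1:10,1:5); % reads only a 10-by-5 block from disk
disp(size(block))
clear m; delete(fname); % clean up the temporary file
```

The sdthdf commands documented here build on the same file format, with v_handle pointers taking the place of the matfile object.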
File handling commands based on HDF5
The following commands are supported.
hdfReadRef
This command handles partial data loading, depending on the level specified by the user.
For unloaded data, a v_handle pointer respecting the data structure and names is generated, so that access is preserved. hdfreadref can be applied to this specific data later to load it.
By default, the full file is loaded. Command option -level specifies the desired loading level. Structured data is organized in layers, each nested substructure adding one level. This command loads data down to a given layer. The most commonly used levels are listed below:
- -level0 Load only the data structure using pointers.
- -level1 Load the data structure and fully load fields not contained in substructures.
- -level2 Load the data structure, and fully load fields including the ones contained in the main data substructures.
- -level100 Load the data structure, and fully load all fields (up to level 100, which is generally sufficient).
It takes as argument either a file or a data structure containing hdf5 v_handle pointers. When a file is specified, the user can specify the data to be loaded by giving its name preceded by a slash /. Substructures can also be targeted by giving the path to the variable to be loaded, with names separated by slashes.
% To load an hdf5 file
r1=sdthdf('hdfreadref','my_file.mat');
% To load it using v_handle pointers
r1=sdthdf('hdfreadref-level0','my_file.mat');
% To load a specified variable
r2=sdthdf('hdfreadref-level0','my_file.mat','/var2');
% To load a specified sub data
r3=sdthdf('hdfreadref-level1','my_file.mat','/var2/subvar1');
% To load a subdata from a previously loaded pointer
r4=sdthdf('hdfreadref',r2.subvar1);
hdfdbsave
This command handles partial data saving to a temporary file. It is designed to unload large numerical data, such as sparse matrices or deformation fields. Command option -struct however allows saving more complex data structures.
The function takes as arguments the data to save and a structure with a field Dbfile containing the temporary file path (string). The function outputs the v_handle to the saved data. The v_handle has the same data structure as the original. The v_handle data can be recovered by hdfreadref.
opt.Dbfile=nas2up('tempname_DB.mat');
r1=sdthdf('hdfdbsave',r1,opt);
r2=sdthdf('hdfdbsave-struct',r2,opt);
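The off-load and recovery steps described above can be chained as in the following sketch. It assumes SDT is installed and only reuses the call patterns documented in this section. The matrix K is an illustrative assumption, and the class of the recovered data has not been checked here.

```matlab
% Sketch only: requires SDT; K is an illustrative matrix
opt.Dbfile=nas2up('tempname_DB.mat'); % temporary file path
K=speye(1000);
K=sdthdf('hdfdbsave',K,opt); % K is replaced by a v_handle to the saved data
% ... memory-demanding operations that do not need K ...
K=sdthdf('hdfreadref',K);    % recover the data from the pointer
```

Overwriting the variable with its v_handle is what releases the memory: the data only returns to RAM when hdfreadref is called.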
hdfmodelsave
This command follows a saving strategy similar to hdfdbsave but is designed to integrate feplot models in hdf5 format. The file linked to the model is not supposed to be temporary, and data names are linked to an SDT model data structure, typically entries of the model stack. Variable data names must be of the format field_name to store model.field in hdf5 format.
For model stack entries, the name must be of the type Stack_type_name to store cf.Stack{'type','name'}.
The function takes as arguments the database file, the feplot handle and the data name, which is interpreted to locate the data in the feplot model. The data is replaced by v_handle pointers in the feplot model. Data can be reloaded with command hdfmodel.
sdthdf('hdfmodelsave','my_file.mat',cf,'Stack_type_name');
hdfmodel
This command loads v_handle data pointers in the feplot model at locations where hdf5 data have been saved. This command works from the hdf file side, and loads all the data stored with standard names into the feplot model. See hdfmodelsave for more information on the standard data names. Command option -check only loads the data contained in the hdf file that is already instanced in the feplot model.
sdthdf('hdfmodel','my_file.mat',cf);
hdfclose
Handling hdf5 files in data structures can become very complex when multiple handles are generated across multiple data structures. This command forces a file to be closed.
sdthdf('hdfclose','my_file.mat');
A lower-level closing call clears the hdf5 libraries when needed:
sdthdf('hdfH5close')
Here is an example of off-loading data to HDF5-based .mat files, and of accessing the data afterwards.
fname=fullfile(sdtdef('tempdir'),'ubeam_Stack_SE.mat');
fname2=fullfile(sdtdef('tempdir'),'ubeam_model.mat');
model=demosdt('demoubeam');cf=feplot;
cf.mdl=fe_case(cf.mdl,'assemble -matdes 2 1 NoT -SE');
cf.Stack{'curve','defR'}=fe_eig(cf.mdl,[5 50 1e3]);
% save(off-load) some stack entries to a file
sdthdf('hdfmodelsave',fname,cf,'Stack_curve_defR')
% save model but not the off-loaded entries
fecom('save',fname2);
cf=fecom('load',fname2); % reload the model
sdthdf('hdfmodel',fname,cf); % reload pointers to the entries
cf.Stack{'defR'}
For HDF-based .mat files (MATLAB 7.3 and later), you can open a v_handle pointer to a variable in the file using
fname=fullfile(sdtdef('tempdir'),'ubeam_Stack_SE.mat');
var=sdthdf('hdfreadref -level0',fname,'Stack_curve_defR')
ioClearCache, ioLoad, ...
io commands provide I/O operations tailored to memory-demanding tasks.
sdthdf('ioFreeCache','fname') or sdthdf('ioFreeCache','_vhandlename') free the cache of a given file or the file associated with a specific v_handle.
sdthdf('ioLoadVarName','fname') loads VarName from file fname and frees the associated cache. This operation still requires memory to store the variable and the file cache and may thus fail for large variables.
sdthdf('ioBufReadVarName','fname') will load VarName from file fname while controlling the cache used. This is only intended for large data sets written to file as contiguous uncompressed data.
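To illustrate the general idea of reading a large contiguous, uncompressed data set through a bounded buffer, here is a sketch based on the base MATLAB HDF5 functions h5create, h5write and h5read. It is not how ioBufRead is implemented. The file name, the data set name /X and the buffer size bufCol are illustrative assumptions.

```matlab
% Illustrative sketch with base MATLAB HDF5 functions, not sdthdf
fname=[tempname '.h5'];
data=reshape(1:6000,1000,6);
% No ChunkSize and no Deflate: contiguous, uncompressed storage
h5create(fname,'/X',size(data));
h5write(fname,'/X',data);
nRow=size(data,1); nCol=size(data,2);
bufCol=2; % number of columns held in memory at once
colSum=zeros(1,nCol);
for j1=1:bufCol:nCol
  n=min(bufCol,nCol-j1+1);
  % read columns j1 to j1+n-1 only (start and count are 1-based)
  buf=h5read(fname,'/X',[1 j1],[nRow n]);
  colSum(j1:j1+n-1)=sum(buf,1);
end
delete(fname); % clean up the temporary file
```

Only nRow-by-bufCol values are held in memory at any time, so the peak memory use is set by the buffer size rather than by the size of the data set.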
MATLAB data handling utilities
compare
The compare command checks the data equivalence of two MATLAB variables. This is an efficient utility to spot local differences in large or complex data.
Any data compound can be input, mixing any native MATLAB classes. The compare command then recursively checks the equivalence of the data compound structure and content. Its output is a cell array with as many lines as differences found. The cell array output is empty if all fields were found equal.
% Comparing two sets of data compounds
r1=struct('data1',{{speye(15)}},'data2',rand(15,1));
r2=struct('data1',{{speye(14)}},'data2',rand(15,1),...
 'data3',1);
sdthdf('compare',r1,r2)
pointerList[sortm,-mb]
The pointerList command outputs the internal memory address of each variable specified in input (expanded for structures and cell arrays) and provides a statistic on the total amount of data pointed to in memory versus the total memory allocated to the storage. MATLAB performs lazy variable copy: copied variables share the same memory data until one of the instances is modified. The traditional output of the whos command may thus be inappropriate to assess memory usage. The following command options allow output variations:
- sortm sorts the output in increasing memory, so that the user sees the largest memory usage at the bottom of the command window.
- -mb converts the memory size outputs from bytes to megabytes.
If no output is specified, the statistics are printed directly on screen. Otherwise a cell array is output, with as many lines as variables found and three columns: the first column is the variable name, the second the memory address, and the third the memory size.
The input is required to be a structure, cell array, v_handle object or a string containing whos. In the latter case, a reformatting of the output of the whos command is performed.
% Getting information on data sizes in memory
% Generate a sample data structure
r1=struct('data1',speye(12),'data2',rand(15,1));
r1.data3=r1.data1; % lazy copy
% reformat the output of whos
sdthdf('pointerlistsortm','whos')
% Get memory information on r1
sdthdf('pointerlistsortm',r1)
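The effect of lazy copy on the figures reported by whos can be seen with base MATLAB alone. The following is a minimal sketch, with illustrative variable names, showing that a copied variable is counted in full even though no additional data has been allocated yet.

```matlab
% Base MATLAB sketch of lazy copy; variable names are illustrative
a=rand(1000);  % 1000-by-1000 doubles, 8e6 bytes of data
b=a;           % lazy copy: b shares the data block of a
s=whos('a','b');
reported=sum([s.bytes]); % whos counts the shared block twice
fprintf('whos reports %g MB for a and b\n',reported/1e6);
b(1)=0;        % first modification triggers the actual copy
```

Before the assignment b(1)=0, the memory actually allocated is about half of what whos reports, which is the kind of discrepancy the pointerList statistics are meant to expose.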
See also