MIRA
This document acts as a manual. For further implementation details see Serialization Framework (Implementation Details).
For information on serialization format changes, see Serialization format changes.
Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection, to be "resurrected" (deserialized) later in the same or another computer environment. The serialization framework that is implemented by MIRA achieves this in a very generic way and additionally extends the C++ language by a very basic "Reflection" concept.
"Reflection" is also known from higher-level, third-generation programming languages like Java and C#. It allows a program to retrieve information about its own structure at runtime, e.g. to query the names and types of the variables and methods of a class.
The provided serialization framework supports:
These capabilities make the serialization framework an important technique that is used for several concepts within the MIRA framework:
The serialization framework can serialize and deserialize built-in fundamental C/C++ types and essential STL types like strings natively, i.e. without including any additional header file:
Type | Remarks |
---|---|
fundamental types: e.g. char, int, uint16, float, bool, ... | native support |
arrays: e.g. int array[10] | native support |
enums | native support |
std::string | native support |
user-defined classes | need to implement a reflect method |
Other user-defined types and classes need to implement a "reflect()" method (see Serialization of user-defined types). Most MIRA classes already provide such a method. For classes of external libraries such as STL, boost, etc. several adapters are provided. Those classes can be serialized by simply including the corresponding header:
Type | MIRA header to include |
---|---|
std::list<> | #include <serialization/adapters/std/list> |
std::vector<> | #include <serialization/adapters/std/vector> |
std::map<> | #include <serialization/adapters/std/map> |
std::multimap<> | #include <serialization/adapters/std/map> |
std::deque<> | #include <serialization/adapters/std/deque> |
std::set<> | #include <serialization/adapters/std/set> |
std::multiset<> | #include <serialization/adapters/std/set> |
std::pair<> | #include <serialization/adapters/std/pair> |
std::shared_ptr<> | #include <serialization/adapters/std/shared_ptr.hpp> |
boost::shared_ptr<> | #include <serialization/adapters/boost/shared_ptr.hpp> |
boost::array<> | #include <serialization/adapters/boost/array.hpp> |
boost::multi_array<> | #include <serialization/adapters/boost/multi_array.hpp> |
boost::optional<> | #include <serialization/adapters/boost/optional.hpp> |
boost::tuple<> | #include <serialization/adapters/boost/tuple.hpp> |
boost::variant<> | #include <serialization/adapters/boost/variant.hpp> |
Eigen::Matrix | #include <serialization/adapters/Eigen/Eigen> |
cv::Size | #include <serialization/adapters/opencv2/core/core.hpp> |
cv::Rect | #include <serialization/adapters/opencv2/core/core.hpp> |
cv::Mat | #include <serialization/adapters/opencv2/core/core.hpp> |
Be careful when serializing platform-dependent types such as size_t, std::size_t and derived types like std::streamoff and std::streampos. These types express sizes of memory blocks or positions within buffers and therefore differ between 32-bit and 64-bit systems. The problem becomes apparent when such a type is serialized as binary content: data serialized on a 32-bit system is not compatible with data generated on a 64-bit system. Use uint32 or uint64 instead, which specify the size explicitly.
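The recommendation above can be sketched with the standard fixed-width types from `<cstdint>`. This assumes that MIRA's uint32/uint64 correspond to these standard types; the helper names `toPortableSize` and `fromPortableSize` are hypothetical and not part of MIRA.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// std::size_t is typically 4 bytes on 32-bit targets and 8 bytes on 64-bit
// targets, so its binary representation is not portable. A fixed-width type
// has the same size everywhere.
static_assert(sizeof(std::uint64_t) == 8, "uint64_t is expected to be 8 bytes");

// Hypothetical helper: widen a size to a fixed-width value before storing it.
inline std::uint64_t toPortableSize(std::size_t n)
{
    return static_cast<std::uint64_t>(n);
}

// Hypothetical helper: convert back after loading and reject values that do
// not fit into this platform's std::size_t (e.g. 64-bit data on a 32-bit system).
inline std::size_t fromPortableSize(std::uint64_t n)
{
    if (n > std::numeric_limits<std::size_t>::max())
        throw std::range_error("stored size does not fit into std::size_t");
    return static_cast<std::size_t>(n);
}
```

A class would then reflect the `std::uint64_t` value instead of the `std::size_t` member itself.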
The actual serialization and deserialization of a value or object is performed by Serializers and Deserializers, respectively. There are different Serializers and Deserializers each of which is able to serialize the data in different formats:
Serializers provide the serialize() method to serialize the value or object into their provided format:
Beside the object that is serialized, a name of that value and a description have to be specified. The name is used to identify the value in the serialized data. The description should give the meaning of the value in detail, and is used by serializers in different ways, e.g. it can be stored as comment with the respective data in an XML document (by the XMLSerializer), or for properties it can be shown to the user in a property editor.
The following example shows how to serialize an STL vector into an XML file:
The example will generate the following xml output:
For deserialization the Deserializers provide the deserialize() method:
This method again takes the name of the value that should be deserialized and a reference to the object where the content should be deserialized.
The following example deserializes an STL vector from an XML file:
The usage of the other serializers is similar. For examples of their usage please refer to the documentation of XMLSerializer, JSONSerializer, BinarySerializer and the corresponding deserializers.
In order to use all of the above-mentioned features for your own classes, all you need to do is add a special reflect() method that exposes all important members and methods of your class to the serialization and reflection framework. There are two ways of making a class "reflectable": via an intrusive reflect method (modifying the class) or via a non-intrusive one (not modifying the class).
To make your class "reflectable", the reflect method must be a member of your class:
In general, names for reflected elements can be chosen freely, as long as they are distinct and use the set of characters that are valid in serialization formats like XML and JSON (e.g. sticking to characters valid for C++ identifiers is a good idea). In addition, '.' should not be used in element names: the '.' character is used as concatenator to designate nested elements (such as MyMember.MyMembersMember) in some contexts. Including '.' within an element's name will prevent certain access/query functionality from working correctly.
Please note that a reflect() method must exist for each type of reflector that can be passed as parameter. The easiest and most common way to achieve that is to provide a template method with the following declaration:
If you use Eclipse for software development, you can also use the reflect-method template that is provided within the MIRA code templates by typing reflect and pressing Ctrl+Space.
(In some very special cases, reflect() should behave differently for a specific reflector. This can be achieved by overloading reflect() for particular reflector parameter types.)
The reflect() method will be invoked each time an object of that class is serialized/deserialized. The serializer/deserializer object (the "reflector") is passed as parameter to the method. For each member you want to serialize, you must call the member() method of the reflector to specify the name of the member, the member variable and a comment that describes the member. The member can be of any type that is itself serializable, i.e. fundamental types like float, STL containers, or instances of complex classes that contain a reflect() method themselves. In the latter case this process will continue recursively, calling the reflect() method of that class until all the data contained in the class is serialized/deserialized.
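The recursion described above can be illustrated with a small self-contained sketch. `ToyWriter` is a hypothetical stand-in for a reflector and is not a MIRA class; only the shape of the `reflect()` template and the `member(name, variable, comment)` call are taken from the text.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy reflector (NOT a MIRA class): writes "name=value;" pairs into a string.
class ToyWriter
{
public:
    // Fundamental types and strings are written directly.
    void member(const char* name, int& v, const char* /*comment*/)         { write(name, v); }
    void member(const char* name, float& v, const char* /*comment*/)       { write(name, v); }
    void member(const char* name, std::string& v, const char* /*comment*/) { write(name, v); }

    // Any other type is expected to provide reflect(); recurse into it.
    template <typename T>
    void member(const char* name, T& object, const char* /*comment*/)
    {
        const std::string saved = mPrefix;
        mPrefix += std::string(name) + ".";
        object.reflect(*this);
        mPrefix = saved;
    }

    std::string str() const { return mOut.str(); }

private:
    template <typename T>
    void write(const char* name, const T& v) { mOut << mPrefix << name << "=" << v << ";"; }

    std::string mPrefix;
    std::ostringstream mOut;
};

struct Point
{
    int x = 1;
    int y = 2;

    template <typename Reflector>
    void reflect(Reflector& r)
    {
        r.member("X", x, "x coordinate");
        r.member("Y", y, "y coordinate");
    }
};

struct Robot
{
    std::string name = "r1";
    Point position;

    template <typename Reflector>
    void reflect(Reflector& r)
    {
        r.member("Name", name, "robot name");
        r.member("Position", position, "nested object, reflected recursively");
    }
};

// Runs the toy reflector over a Robot and returns the collected text.
inline std::string dump(Robot& robot)
{
    ToyWriter w;
    robot.reflect(w);
    return w.str();
}
```

Because `reflect()` is a template, the same method body would also work with a reader-style reflector that assigns to the members instead of printing them.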
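When subclassing, the reflect() method of the derived class should also reflect the base class (via reflectBase() or MIRA_REFLECT_BASE, see the notes on base-class versions below) in addition to its own members:
    { ...
        template<typename Reflector>
        void reflect(Reflector& r) {
            // call base class reflect
            r.member("Member", mA, "Comment on A");
        }
    ... };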
Sometimes you cannot alter the code of a class, e.g. for types provided by an external library. For such cases MIRA provides a way to make the class "reflectable" non-intrusively: you just have to add the reflect method as a global function:
Again, the reflect() function must have overloads for each type of reflector or be a template, as shown above.
Note also that the members must be publicly accessible for the above example to work. However, if the members are protected and the class provides getter and setter methods, you can use these to reflect the members. See Getters and Setters.
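As a minimal sketch of the non-intrusive form: the free function template below reflects a type that cannot be modified. `ExternalSize` and `ToyWriter` are hypothetical stand-ins, and the parameter order `(reflector, object)` is an assumption of this sketch.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Stand-in for a type from an external library that cannot be modified.
struct ExternalSize
{
    int width;
    int height;
};

// Toy reflector (NOT a MIRA class) that records "name=value;" pairs.
class ToyWriter
{
public:
    void member(const char* name, int& v, const char* /*comment*/)
    {
        mOut << name << "=" << v << ";";
    }
    std::string str() const { return mOut.str(); }

private:
    std::ostringstream mOut;
};

// Non-intrusive reflect: a free function template. It can only reach the
// public members of ExternalSize.
template <typename Reflector>
void reflect(Reflector& r, ExternalSize& s)
{
    r.member("Width", s.width, "width in pixels");
    r.member("Height", s.height, "height in pixels");
}

// Runs the toy reflector over an ExternalSize and returns the collected text.
inline std::string dump(ExternalSize& s)
{
    ToyWriter w;
    reflect(w, s);
    return w.str();
}
```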
The serialization framework also supports the serialization of pointers and smart pointers (boost::shared_ptr, std::shared_ptr). When serializing a pointer, it is not sufficient to store the value of the pointer, rather the object it points to must be saved. When the member is loaded later, a new object is created and a new pointer to the object is loaded into the class member.
If the same pointer (pointing at the same object address) is serialized more than once within one object, only one instance is added to the serialized data. When deserialized, data is read back in only for the first pointer, the second (and further) pointer is set to point to the same address as the first one. To do so, all stored objects are tracked by the serialization framework. If you try to serialize a pointer to a previously serialized object, the framework will store a reference to the previously stored object instead of storing the content of the object again. In order to reference other objects, each object has a unique id, that is formed using the object's name and the names of its parent objects separated by a ".".
The following class:
will be serialized using the XMLSerializer as:
Note, that the pointers "ptr2" and "ptr3", pointing to values already stored before, use references instead of storing the values twice.
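The tracking idea can be sketched independently of MIRA: remember the address of every stored object together with its id, and emit only a reference when the same address appears again. `ToyTrackingWriter` and the `->` reference notation are inventions of this sketch and do not reflect MIRA's actual output format.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

struct Payload
{
    int value;
};

// Toy pointer tracker (NOT MIRA's implementation): the first time an address
// is seen its content is written; later occurrences only write a reference
// to the id under which the object was first stored.
class ToyTrackingWriter
{
public:
    void pointer(const std::string& id, const Payload* p)
    {
        std::map<const void*, std::string>::const_iterator it = mSeen.find(p);
        if (it == mSeen.end()) {
            mSeen[p] = id;                             // remember where it was stored
            mOut << id << "=" << p->value << ";";      // store the content once
        } else {
            mOut << id << "->" << it->second << ";";   // store a reference only
        }
    }
    std::string str() const { return mOut.str(); }

private:
    std::map<const void*, std::string> mSeen;
    std::ostringstream mOut;
};

// Serializes two pointers to the same object and returns the collected text.
inline std::string demo()
{
    Payload payload = {7};
    ToyTrackingWriter w;
    w.pointer("obj.ptr1", &payload);
    w.pointer("obj.ptr2", &payload); // same address: written as a reference
    return w.str();
}
```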
If you deserialize a normal pointer, the object the pointer points to is created by the serialization framework, and you are responsible for deleting it:
    MyObject* obj = NULL;
    // deserialize the "pointer": a new object will be created and a pointer
    // to that object is stored in "obj":
    deserializer.deserialize("myObject", obj);
    ...
    // make sure to delete obj, if you do not need it any longer
    delete obj;
Like in many other cases, it is safer to use smart pointers instead:
    std::shared_ptr<MyObject> obj;
    // deserialize the "pointer"
    deserializer.deserialize("myObject", obj);
    // object will be freed automatically by the smart pointer
Special care must be taken when serializing pointers to base classes of polymorphic types, since the pointer may point to one of several possible concrete derived classes. So when the pointer is saved, the class name must be saved, too.
When the pointer is deserialized, the class name is read and an instance of the corresponding class is constructed using the class factory. Finally, the data can be loaded to the newly created instance of the correct type.
Since the serialization framework works closely together with the class factory, when deserializing polymorphic classes, your polymorphic classes must be instantiable by the class factory. Hence, if you want to serialize and deserialize polymorphic classes, these classes must be derived from Object and must contain a MIRA_OBJECT macro. Moreover, these classes must be registered in the class factory and the serialization framework using the MIRA_CLASS_SERIALIZATION as shown in these examples:
In the following XML file the class name of a polymorphic object instance is specified:
When the object is deserialized from the above XML file, an object of the class "MyClass2" will be created automatically and the pointer to that class is stored in the pointer "object" which is of the type MyBaseClass*:
Please note that the MIRA_CLASS_SERIALIZATION macro usually needs to be placed within the source file (instead of the class header), to make sure the registration code is instantiated only once.
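Why a class name has to be stored, and how it is turned back into an object, can be illustrated with a toy name-to-creator registry. `ToyFactory` is hypothetical and much simpler than MIRA's class factory; it only shows the lookup step that happens when a base-class pointer is deserialized.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

struct MyBaseClass
{
    virtual ~MyBaseClass() {}
    virtual std::string className() const = 0;
};
struct MyClass1 : MyBaseClass { std::string className() const override { return "MyClass1"; } };
struct MyClass2 : MyBaseClass { std::string className() const override { return "MyClass2"; } };

// Toy factory (NOT MIRA's class factory): maps a stored class name to a creator.
class ToyFactory
{
public:
    typedef std::function<std::unique_ptr<MyBaseClass>()> Creator;

    void registerClass(const std::string& name, Creator c) { mCreators[name] = c; }

    // Returns a null pointer when the class name is unknown.
    std::unique_ptr<MyBaseClass> create(const std::string& name) const
    {
        std::map<std::string, Creator>::const_iterator it = mCreators.find(name);
        return it == mCreators.end() ? std::unique_ptr<MyBaseClass>() : it->second();
    }

private:
    std::map<std::string, Creator> mCreators;
};

// Registers both derived classes under their names.
inline ToyFactory makeFactory()
{
    ToyFactory f;
    f.registerClass("MyClass1", [] { return std::unique_ptr<MyBaseClass>(new MyClass1); });
    f.registerClass("MyClass2", [] { return std::unique_ptr<MyBaseClass>(new MyClass2); });
    return f;
}
```

A deserializer would read the class name from the data, call `create()` with it, and then load the members into the returned instance.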
Versioning of classes is optional, but can be used to maintain backward compatibility when changes in the serialized members are necessary (adding additional members, removing members, changing the name or order of the members, etc).
If multiple versions have existed in the past, but only a certain version is supported now, you can add a call to requireVersion() in the reflect() method to specify that version:
This specifies the current version is 3, and only this version can be used.
When serialized by an XMLSerializer, the output will look like this:
On deserialization, requireVersion() will throw an exception if the version found in the serialized data differs from the required version.
If you want to support different versions, you can use version() instead of requireVersion().
While deserializing the object, version() will return the class version that is stored in the serialized data (XML file, etc.). Afterwards you can deserialize the specific members depending on the version, as in the example above. While serializing an object, version() will by default return the version that was specified as parameter. However, the reflector can be configured to serialize a specific version, and the reflect() method should support that. See Serializing to a Specific Version below.
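A version-dependent `reflect()` might be structured roughly as follows. `ToyReader`, the `Range` class and its member names are hypothetical; the sketch only models the behaviour described above, where `version()` returns the stored version during deserialization.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy deserializer (NOT a MIRA class): version() ignores the version passed in
// by the class and returns the version found in the stored data.
class ToyReader
{
public:
    ToyReader(int storedVersion, const std::map<std::string, int>& data)
        : mVersion(storedVersion), mData(data) {}

    int version(int /*currentVersion*/) { return mVersion; }

    void member(const char* name, int& v, const char* /*comment*/)
    {
        v = mData.at(name); // throws std::out_of_range if the member is missing
    }

private:
    int mVersion;
    std::map<std::string, int> mData;
};

struct Range
{
    int minimum = 0;
    int maximum = 0;

    template <typename Reflector>
    void reflect(Reflector& r)
    {
        int v = r.version(2); // the current version of this class is 2
        if (v < 2) {
            // version 1 stored only "Size"; derive the new members from it
            int size = 0;
            r.member("Size", size, "legacy member of version 1");
            minimum = 0;
            maximum = size;
            return;
        }
        r.member("Min", minimum, "lower bound");
        r.member("Max", maximum, "upper bound");
    }
};

// Restores a Range from data that was stored with the given version.
inline Range load(int storedVersion, const std::map<std::string, int>& data)
{
    ToyReader reader(storedVersion, data);
    Range range;
    range.reflect(reader);
    return range;
}
```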
The macros MIRA_REFLECT_VERSION and MIRA_REFLECT_REQUIRE_VERSION can be used instead of version()/requireVersion():
Do not specify your class to have version 0, always start with version 1 or higher (0 is used as a dummy version value by various reflectors for objects not providing version information).
Historically, classes have just been defining their current version themselves during serialization (the case where an object exists and its state is read out and serialized), by calling Reflector::version() with the respective version number parameter. Different versions of a class were only considered during deserialization (i.e. restoring an object state from serialized data). In some cases it may be desirable, however, to serialize a different version (naturally, this can not be a higher version than the class implementation knows, only lower). This is useful e.g. to ensure serializing data that is compatible with a certain other (older) implementation, thus it can be deserialized by another instance.
For such cases, a mechanism is provided to generally enable requesting a specific version per class from the reflector, through the method Serializer::desireClassVersions(). This method takes as parameter a map of class type (type name) to version number. When a class contained in the map calls Reflector::version() in its reflect() method, the call shall return the version number from the configured version map instead of the version number indicated by the class implementation itself. The class serialization must then follow that returned version to create compatible serialized data (just as it would read data according to the actual version number during deserialization).
However, this is a late addition to the serialization framework, and many classes already exist(ed) that assume only the current version is ever needed during serialization and only support that one version properly (this is e.g. common when reflection is split into reflectRead() and reflectWrite() methods, as described in Advanced Techniques). In order to make sure the reflector does not assume one version number but the class reflection ignores it and implements another (ending up with inconsistent serialized data), the serializer must know whether the class actually supports the desired version it will return (if different than the one provided as the parameter by the class). To this purpose, variants of Reflector::version()/requireVersion() have been added with an additional AcceptDesiredVersion parameter. These, when called, include the implicit contract that the class will accept and properly implement any returned version (<= current version, and >= minVersion in case of requireVersion). When the class calls the 'traditional' Reflector::version()/requireVersion() methods instead, however (as all previously existing implementations do until updated), the serializer will still check if a different version is desired, but will issue a warning in this case and go on to return the version number the reflected class has provided as parameter.
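The lookup described above can be modelled with a small map from type name to version. `ToyVersionedWriter` is hypothetical; in particular, how a desired version higher than the implemented one is treated is an assumption of this sketch, not documented MIRA behaviour.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy serializer (NOT a MIRA class). version() returns the desired version
// for a class if one was configured, otherwise the version the class passes in.
class ToyVersionedWriter
{
public:
    // Hypothetical counterpart of a "type name -> version" map.
    void desireClassVersions(const std::map<std::string, int>& versions)
    {
        mDesired = versions;
    }

    int version(int implementedVersion, const std::string& typeName) const
    {
        std::map<std::string, int>::const_iterator it = mDesired.find(typeName);
        // Assumption of this sketch: a desired version higher than the one the
        // class implements is ignored.
        if (it != mDesired.end() && it->second <= implementedVersion)
            return it->second;
        return implementedVersion;
    }

private:
    std::map<std::string, int> mDesired;
};

struct Pose
{
    double x, y, phi;
    Pose() : x(0), y(0), phi(0) {}

    // Returns the number of members that would be written, which depends on
    // the version. (The members are not written anywhere in this sketch.)
    int reflect(ToyVersionedWriter& r)
    {
        int v = r.version(2, "Pose");
        // Version 1 had only x and y; version 2 added phi.
        return v >= 2 ? 3 : 2;
    }
};
```

The important point is that the class acts on the version that is returned, not on the version it passed in.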
When a class is declared inheriting from a base class, it may happen that both the base and the subclass independently undergo changes over time and different versions exist for both. In that case, it is possible to independently declare a version in each of the reflect() methods.
This will serialize e.g. to XML like this:
Here, different versions are assigned to different parts of the same object (which are reflected in separate parts of the code), distinguished by type (type name). In order to tell the reflector which type the version refers to, version<Type>() is a template method that is called either with an explicit type template parameter or with a pointer to the object as additional parameter (employing automatic type deduction by the compiler). In an intrusive reflect(), the this pointer can simply be used as the additional parameter, as seen in the examples above. In non-intrusive reflection, the first form is more common:
On the other hand, not all serializers store the type name in the serialized data to distinguish between versions (e.g. the BinarySerializer does not store any metadata). For these, it is very important not to call the base class' reflect() directly, but to use reflectBase() or MIRA_REFLECT_BASE/MIRA_REFLECT_BASE_NONINTRUSIVE, so that the serializer can separate these portions of the reflection and understand that they (at least potentially) use their own version numbers. This is the case even if version() is not (yet) used in one or both parts, since someone might want to add it in later versions of those classes.
Instead of using versioning, using default values is often sufficient to maintain backward compatibility when new members are added to classes. Default values can be specified as an optional parameter of the member() method:
In the above example mI will be set to the default value 123 if the XML file does not contain the member "i". Additionally, a warning will be printed via the error logging framework. If no default value was specified, deserializing the above object would result in an exception if the member "i" is missing.
Default values that are specified within a class' reflect() method, can also be used to initialize the corresponding members within the constructor. Therefore, a special "DefaultInitializer" reflector is provided which visits the reflect method and initializes all members with the specified default values. To simplify this process even more, you can use the MIRA_INITIALIZE_THIS macro as shown in the following example:
If your reflection contains setters or notifiers, MIRA_INITIALIZE_THIS executes them. As with any call from within the constructor, be careful if you end up calling virtual functions (in particular, take care not to call a pure virtual function).
Instead of using a default value, you can also specify serialization::IgnoreMissing as last parameter:
This will neither produce an exception, nor set a default value if the parameter "Value" is missing. Instead, the parameter is ignored and its value is not changed at all. This behavior is useful, if the value was set correctly before (e.g. in the constructor) and should not be altered if it is not specified in the configuration file.
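The three behaviours for a missing member (exception, default value, ignore) can be compared side by side in a toy deserializer. `ToyReader` and `ToyIgnoreMissing` are hypothetical stand-ins; only the overall call shape `member(name, variable, comment, extra)` follows the text.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

struct ToyIgnoreMissing {}; // hypothetical stand-in for serialization::IgnoreMissing

// Toy deserializer (NOT a MIRA class) showing the three behaviours for a
// member that is absent from the stored data.
class ToyReader
{
public:
    explicit ToyReader(const std::map<std::string, int>& data) : mData(data) {}

    // No default: a missing member is an error.
    void member(const char* name, int& v, const char* /*comment*/)
    {
        if (!find(name, v))
            throw std::runtime_error(std::string("missing member: ") + name);
    }

    // Default given: a missing member is set to the default value.
    void member(const char* name, int& v, const char* /*comment*/, int defaultValue)
    {
        if (!find(name, v))
            v = defaultValue;
    }

    // Ignore-missing tag: a missing member keeps whatever value it already has.
    void member(const char* name, int& v, const char* /*comment*/, ToyIgnoreMissing)
    {
        find(name, v);
    }

private:
    // Copies the stored value into v and returns true if the member exists.
    bool find(const char* name, int& v) const
    {
        std::map<std::string, int>::const_iterator it = mData.find(name);
        if (it == mData.end())
            return false;
        v = it->second;
        return true;
    }

    std::map<std::string, int> mData;
};
```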
Instead of using the variable of a member in the reflect() method, you can specify getter and setter methods that the serializers and deserializers should use to access the member. This is useful when additional values or look-up tables need to be computed after a certain member is deserialized, or for converting the values of members before they are serialized and deserialized (e.g. for converting the angle from rad to deg in getAngle() before storing it and for converting it back in setAngle() after restoring it in the example below).
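The rad/deg conversion mentioned above can be sketched without MIRA. `ToyStore` is a hypothetical stand-in for a reflector that accesses the value only through the supplied getter and setter; the `Joint` class and the constant `kPi` are likewise made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

const double kPi = 3.14159265358979323846;

// Toy reflector (NOT a MIRA class) that stores one value and accesses the
// member only through the supplied getter and setter.
class ToyStore
{
public:
    ToyStore() : mStored(0.0) {}
    void save(const std::function<double()>& getter) { mStored = getter(); }
    void load(const std::function<void(double)>& setter) { setter(mStored); }
    double stored() const { return mStored; }

private:
    double mStored;
};

// Keeps the angle in radians internally but exposes it in degrees.
class Joint
{
public:
    Joint() : mAngleRad(0.0) {}
    double getAngle() const { return mAngleRad * 180.0 / kPi; }  // rad -> deg
    void setAngle(double deg) { mAngleRad = deg * kPi / 180.0; } // deg -> rad
    double radians() const { return mAngleRad; }
    void setRadians(double rad) { mAngleRad = rad; }

private:
    double mAngleRad;
};

// Stores the joint angle through the getter and returns what was stored.
inline double savedDegrees(const Joint& j)
{
    ToyStore s;
    s.save([&j] { return j.getAngle(); });
    return s.stored();
}
```

The stored representation is in degrees, while the object itself keeps working in radians.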
Properties are parameters that can be changed at runtime via a property editor. There are two kinds of properties - read/write properties and read-only properties. They support the same syntax as members, but additionally they provide mechanisms to specify hints like limits or enumerations. Read-only properties also can not have setters. Let's start with a simple example.
For a graphical property editor it can be useful to specify limits for a property in order to limit input ranges for used editors like spinboxes or sliders. Therefore property hints are used.
For some editors like sliders or spinboxes it can be useful to specify the step size for changing the value. For example, a property that should be incremented/decremented in steps of 10 could be written as:
It is even possible to combine these hints in order to allow specifying limits and steps at once:
To be able to choose the right editor widget for the property one can specify the type of the property.
For convenience there are already two hints for sliders and spin boxes defined:
Some properties allow the user to select from a given set of options. This is called an enumeration and the graphical editor will display a combobox for these properties.
In some cases it is desirable to just make the value of a member also observable at runtime as a read-only property. Instead of calling both member() and roproperty() (with the same or related name), this can simply be achieved by using REFLECT_CTRLFLAG_MEMBER_AS_ROPROPERTY on the call to member(), avoiding code duplication. However, PropertyHints cannot be specified this way. Also, when adding this flag to an existing member, be aware that (read-only) properties may require more caution than plain members (see the documentation of mira::ReflectCtrlFlags for some related aspects).
Setters offer a powerful mechanism to handle a changed value of a member or property in different ways. However, in some cases you just want to get notified whenever the value of one or more properties is changed. This usually is the case when writing visualization classes. These classes usually have a large number of properties that control the appearance. When such a property is set, usually no special setter shall be called, but the visualisation should be notified to redraw itself in order to visualize the changes immediately. For this purpose, the setterNotify() method is provided. It can be used to create a predefined setter that takes the member whose value should be set and a user defined callback function that is called, whenever the value changes:
As you can see in the above example, the setterNotify() method takes the member whose value should be set as first parameter and the notification function as second parameter. The latter can be a member function (as in the first line) or a function bound using boost::bind (as in the second line).
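The idea behind such a predefined setter can be sketched in plain C++. `makeNotifyingSetter` is a hypothetical stand-in for setterNotify() and does not reproduce its real signature; this version notifies on every assignment, without checking whether the value actually changed.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for setterNotify(): returns a setter that assigns the
// new value to the member and then invokes the notification callback.
template <typename T>
std::function<void(const T&)> makeNotifyingSetter(T& member, std::function<void()> notify)
{
    return [&member, notify](const T& value) {
        member = value;
        notify();
    };
}

class Visualization
{
public:
    Visualization() : mLineWidth(1), mRedraws(0) {}

    // Sets the property through the notifying setter, which triggers redraw().
    void setLineWidth(int w)
    {
        makeNotifyingSetter(mLineWidth, [this] { redraw(); })(w);
    }

    int lineWidth() const { return mLineWidth; }
    int redraws() const { return mRedraws; }

private:
    void redraw() { ++mRedraws; }

    int mLineWidth;
    int mRedraws;
};
```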
For details on how to properly handle non-static properties (i.e., not the properties' values, but the set of properties themselves can change), see Dynamic properties.
Support for STL containers: vector, list, deque, set, multiset:
Serialized content in XML format:
Serialized content in JSON format:
Support for map, multimap:
Serialized content in XML format:
Serialized content in JSON format:
This section is for advanced users who are familiar with the usage of the serialization framework.
Imagine you have the following class:
In XML an instance of Foo will be serialized as:
In most cases this will be satisfactory. However, sometimes a more convenient form of storage is desired, which avoids the occurrence of the additional "Value" tag. Instead the object should be stored as:
In other words, the "Value" should be transparent to the user and the "Foo" class should be serialized as if it was from the underlying type of "Value" (in this example 'std::string').
To achieve this, you need to modify the above example as follows:
Note, that the "member" call in the reflect method was replaced by "delegate" and that a specialization of the IsTransparentSerializable type trait was added.
The specialization of the template class IsTransparentSerializable must be done in the mira namespace.
You can only make classes "transparent serializable" that contain a SINGLE serialized member. Multiple calls of "delegate", or the combined usage of "delegate" with "member" or "property" in the same reflect method, are not allowed and result in undefined behavior.
Delegation can also be used with getters and setters:
It is also possible to make a class transparent serializable only for specific reflectors. This will also require overloads of the reflect() method with different reflector parameters:
Note that the IsTransparentSerializable trait has no actual effect for BinarySerializer/BinaryDeserializer. That is because 'data transparency' is determined by the use of delegate() in the reflect() method. The transparency trait is only informing the serializer to not add a member structure element for the embedded data. Since binary serialized data does not include any such structure information, the transparency trait is meaningless. XMLDeserializer and JSONDeserializer also do not require the type trait, they work just based on delegate(). Thus, in most use cases, it should be sufficient to define IsTransparentSerializable generically (for any reflector type), in rare cases it may be required to specialize it for the XMLSerializer only (when XMLSerializer delegates but JSONSerializer does not, or vice versa, see example above).
Normally a single reflect for serialization and deserialization is used as members are serialized and deserialized in the same way. But sometimes you want to transform a member into something else or serialize it in a different format. In that case different code must be used for reading and writing data from your class. The serialization framework supports this by allowing to split the reflect method in two parts - reflectRead and reflectWrite.
First, a macro must be used inside or outside your class, depending on whether you want to define your reflect methods intrusively or non-intrusively (MIRA_SPLIT_REFLECT_MEMBER or MIRA_SPLIT_REFLECT, respectively). After that, you need to implement the two methods: reflectRead for serializing your class members and reflectWrite for deserializing them.
In the example a uint8 bitfield is used as member but should be reflected bitwise.
If your class is to be serialized via BinarySerializer, it is crucial to have the same number, types and order of members in the read and write reflect methods.
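The two directions of a split reflection for a bitfield boil down to unpacking and packing the bits. The helpers below are hypothetical and independent of MIRA: `unpackBits()` is what a read-side method would do before handing the single flags to the reflector, `packBits()` what a write-side method would do after receiving them.

```cpp
#include <cassert>
#include <cstdint>

// Eight individual flags, in the same order for both directions.
struct Flags
{
    bool bit[8];
};

// Splits a uint8 bitfield into individual flags (bit 0 first).
inline Flags unpackBits(std::uint8_t field)
{
    Flags f;
    for (int i = 0; i < 8; ++i)
        f.bit[i] = ((field >> i) & 1u) != 0;
    return f;
}

// Reassembles the bitfield from the individual flags (bit 0 first).
inline std::uint8_t packBits(const Flags& f)
{
    std::uint8_t field = 0;
    for (int i = 0; i < 8; ++i)
        if (f.bit[i])
            field = static_cast<std::uint8_t>(field | (1u << i));
    return field;
}
```

Both helpers iterate over the flags in the same order, which mirrors the requirement that the read and write methods agree on number, types and order of members.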
Specializing reflection for different reflector classes by overloading the reflect method has been mentioned a few times above.
Here is the practical example of reflection of XMLDom, with an overload for the XMLSerializer:
Special care must be taken when overloading for BinarySerializer (which actually is a base class for an entire group of specific reflectors), in order to include the MetaSerializer correctly. MetaSerializer is subclassed from BinarySerializer, and since its purpose is to describe the data produced by the BinarySerializer, any special reflect implementation for the BinarySerializer should be applied for the MetaSerializer as well (if the reflected class is meta-serializable at all, i.e. serialized data has a fixed binary layout, which is not the case e.g. for dynamic size matrices, images etc.).
Here is a simple idea, that is problematic though:
The overloaded variant will be applied for BinarySerializers, including MetaSerializer. However, the reflector in the argument is taken as a reference to a base BinarySerializer (even for the MetaSerializer subclass), and since these reflectors do not use runtime polymorphism (virtual methods), any call to r.member, r.property etc. will call BinarySerializer's implementation. In the case of MetaSerializer it would ignore the more specific MetaSerializer::property(). In the generic variant, that problem is avoided by using the exact reflector type as template parameter (if reflection was using runtime polymorphism, the method could just take e.g. an AbstractReflector reference).
The correct solution is to use again the exact reflector type as template parameter, but still provide a special implementation for BinarySerializers and subclasses, e.g. like this:
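One way to keep the exact reflector type while still selecting a special implementation for a base class and its subclasses is `std::enable_if` combined with `std::is_base_of`. The sketch below uses toy reflector types (not the MIRA classes) and also includes the problematic base-reference form for comparison; whether MIRA's own example uses this exact mechanism is not confirmed by the text.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Toy reflector hierarchy (NOT the MIRA classes); no virtual methods, as in
// the text above.
struct ToyBinarySerializer { std::string kind() { return "binary"; } };
struct ToyMetaSerializer : ToyBinarySerializer { std::string kind() { return "meta"; } };
struct ToyXmlSerializer { std::string kind() { return "xml"; } };

struct Sample
{
    std::string log;

    // Generic variant: chosen for reflectors not derived from ToyBinarySerializer.
    template <typename Reflector>
    typename std::enable_if<!std::is_base_of<ToyBinarySerializer, Reflector>::value>::type
    reflect(Reflector& r)
    {
        log = "generic:" + r.kind();
    }

    // Special variant: chosen for ToyBinarySerializer and all subclasses. The
    // parameter keeps the exact reflector type, so r.kind() resolves to the
    // subclass method instead of the base class one.
    template <typename Reflector>
    typename std::enable_if<std::is_base_of<ToyBinarySerializer, Reflector>::value>::type
    reflect(Reflector& r)
    {
        log = "binary-like:" + r.kind();
    }

    // The problematic form for comparison: the reflector is seen as its base,
    // so the base class method is called even for a ToyMetaSerializer.
    void reflectSliced(ToyBinarySerializer& r)
    {
        log = "binary-like:" + r.kind();
    }
};
```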
Pointers that point to pointers cannot be serialized. If you try to serialize a pointer to a pointer, you will get the following compiler error:
Instead of serializing the pointer to a pointer, you should serialize the pointer that is pointed to. There should never be a need to serialize a pointer to a pointer; if there seems to be one, reconsider the design of your code.
Pointers that point to fundamental types (int, float, etc.) cannot be serialized. This restriction is made for performance reasons. If you try to serialize a pointer to a fundamental type, you will get the following compiler error:
If you really need to serialize a pointer to a fundamental type, you must wrap the fundamental type in a class or struct.
When pointers are serialized improperly a so-called pointer conflict may arise as shown in the following example:
In this example, the pointer "ptr" points to the object "obj". Moreover, the pointer is reflected BEFORE the object. Here, the problem occurs. When the pointer "ptr" is serialized the underlying object "obj" was not serialized yet, hence the serialization framework will serialize the whole content of the object. Afterwards, the object "obj" will be serialized. However, the object was already serialized before using the pointer "ptr" and should not be serialized twice. In this case, an XIO exception will be thrown to indicate the problem.
To resolve this conflict one only has to switch the serialization order of the pointer and the object:
Now, the object "obj" will be serialized first. When the pointer "ptr" is serialized afterwards, the underlying object will not be serialized a second time, instead a reference to the previously serialized object will be stored for the pointer and hence there is no conflict here.
Abstract classes can be serialized only if they are derived from mira::Object. Otherwise you will get the following compiler error:
The reason for this restriction is, that objects of abstract classes cannot be created during the process of deserialization. This can be achieved using the class factory only, which will create an object of the derived (non-abstract) class. Hence, if you want to serialize abstract types, they need to be inherited from the mira::Object in order to use the class factory. Note that abstract classes are just a special case of polymorphic classes (identifiable at compile time), and that ALL polymorphic classes need to be derived from mira::Object to work with serialization properly. See Polymorphic Classes for details.