, 05:49, 6 December 2021
Codecs are a serialization tool from Mojang's DataFixerUpper library. Codecs are used alongside [[DynamicOps/1.17|DynamicOps]] to allow objects to be serialized to different formats and back, such as JSON or [[Using NBT/1.17|NBT]]. A DynamicOps describes the format an object is serialized to, while a Codec describes the manner in which the object is serialized; a single Codec can be used to serialize an object to any format for which a DynamicOps exists. Codecs and DynamicOps jointly form an abstraction layer over data serialization, simplifying the effort needed to serialize or deserialize data.
= Using Codecs =
== Serialization and Deserialization ==
The primary use for Codecs is to serialize Java objects to some serialized type, such as a JsonElement or a Tag, and to deserialize a serialized object back to its proper Java type. This is accomplished with <code>Codec#encodeStart</code> and <code>Codec#parse</code>, respectively. Given a Codec<SomeJavaType> and a DynamicOps<SomeSerializedType>, we can convert instances of SomeJavaType to instances of SomeSerializedType and back.
Each of these methods takes a [[DynamicOps/1.17|DynamicOps]] instance and an instance of the object we are serializing or deserializing, and returns a DataResult:
<syntaxhighlight lang="java">
// let someCodec be a Codec<SomeJavaType>
// let someJavaObject be an instance of SomeJavaType
// let someTag and someJsonElement be instances of Tag and JsonElement, respectively
// serialize some java object to a Tag
DataResult<Tag> encodedTag = someCodec.encodeStart(NbtOps.INSTANCE, someJavaObject);
// deserialize some Tag instance back to a proper java object
DataResult<SomeJavaType> decodedFromTag = someCodec.parse(NbtOps.INSTANCE, someTag);
// serialize some java object to a JsonElement
DataResult<JsonElement> encodedJson = someCodec.encodeStart(JsonOps.INSTANCE, someJavaObject);
// deserialize a JsonElement back to a proper java object
DataResult<SomeJavaType> decodedFromJson = someCodec.parse(JsonOps.INSTANCE, someJsonElement);
</syntaxhighlight>
A DataResult holds either the converted instance or some error data, depending on whether the conversion was successful. There are several things we can do with this DataResult: <code>DataResult#result</code> simply returns an Optional containing the converted object if the conversion was successful, while <code>DataResult#resultOrPartial</code> also runs a given function if the conversion was unsuccessful (in addition to returning the Optional). <code>DataResult#resultOrPartial</code> is particularly useful for logging errors during datapack deserialization:
<syntaxhighlight lang="java">
// deserialize something from json
someCodec.parse(JsonOps.INSTANCE, someJsonElement)
    .resultOrPartial(errorMessage -> doSomethingIfBadData(errorMessage))
    .ifPresent(someJavaObject -> doSomethingIfGoodData(someJavaObject));
</syntaxhighlight>
== Builtin Codecs ==
=== Primitive Codecs ===
The Codec class itself contains static instances of codecs for all supported primitive types, e.g. <code>Codec.STRING</code> is the canonical <code>Codec<String></code> implementation. Primitive codecs include:
* BOOL, which serializes to a boolean value
* BYTE, SHORT, INT, LONG, FLOAT, and DOUBLE, which serialize to numeric values
* STRING, which serializes to a string
* BYTE_BUFFER, INT_STREAM, and LONG_STREAM, which serialize to lists of numbers
* EMPTY, which represents null objects
=== Other Builtin Codecs ===
Vanilla Minecraft has many builtin codecs for objects that it frequently serializes. These codecs are typically static instances in the class the codec is serializing; e.g. <code>ResourceLocation.CODEC</code> is the canonical <code>Codec<ResourceLocation></code>, while <code>BlockPos.CODEC</code> is the codec used for serializing a BlockPos.
Each vanilla <code>Registry</code> acts as the Codec for the type of object the registry contains; e.g. <code>Registry.BLOCK</code> is itself a <code>Codec<Block></code>. Forge Registries, however, do not currently implement Codec and cannot yet be used in this way; custom codecs must be created for Forge-specific registries that are not tied to specific vanilla registries.
Of particular note here is <code>CompoundTag.CODEC</code>, which can be used to e.g. serialize a CompoundTag into a json file. This has a notable limitation in that <code>CompoundTag.CODEC</code> ''cannot'' safely deserialize lists of numbers from json, due to the strong typing of ListTag and the way that the NbtOps deserializer reads numeric values.
= Creating Codecs =
Suppose we have the following class, and we want to deserialize json files to instances of this class:
<syntaxhighlight lang="java">
public class ExampleCodecClass {
    private final int someInt;
    private final Item item;
    private final List<BlockPos> blockPositions;

    public ExampleCodecClass(int someInt, Item item, List<BlockPos> blockPositions) {...}

    public int getSomeInt() { return this.someInt; }
    public Item getItem() { return this.item; }
    public List<BlockPos> getBlockPositions() { return this.blockPositions; }
}
</syntaxhighlight>
Where a json file for an instance of this class might look like:
<syntaxhighlight lang="json">
{
    "some_int": 42,
    "item": "minecraft:gold_ingot",
    "block_positions": [
        [0, 0, 0],
        [10, 20, -100]
    ]
}
</syntaxhighlight>
We can assemble a codec for this class by building a new codec out of smaller codecs. We'll need a codec for each of these fields:
* a <code>Codec<Integer></code>
* a <code>Codec<Item></code>
* a <code>Codec<List<BlockPos>></code>
And then we'll need to assemble these into a <code>Codec<ExampleCodecClass></code>.
As previously mentioned, we can use <code>Codec.INT</code> for the integer codec, and <code>Registry.ITEM</code> for the Item codec. We don't have a builtin codec for list-of-blockpos, but we can use BlockPos.CODEC to create one.
== Lists ==
The <code>Codec#listOf</code> instance method can be used to generate a codec for a List from an existing codec:
<syntaxhighlight lang="java">
// BlockPos.CODEC is a Codec<BlockPos>
Codec<List<BlockPos>> blockPosListCodec = BlockPos.CODEC.listOf();
</syntaxhighlight>
Codecs created via listOf() serialize things to listlike objects, such as [] json arrays or ListTags.
Deserializing a list in this manner produces an ''immutable'' list. If a mutable list is needed, [[Codecs#Equivalent Types and xmap/1.17|xmap]] can be used to convert the list after deserializing.
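As a rough sketch of the conversion involved, the two functions below are the kind of pair that could be passed to <code>Codec#xmap</code> on a list codec. The class and method names are hypothetical, and the example uses only plain Java collections, with <code>List.of</code> standing in for the immutable list a listOf() codec would produce:

<syntaxhighlight lang="java">
import java.util.ArrayList;
import java.util.List;

final class MutableListConversion {
    // Deserializing direction: copy the immutable list into a mutable ArrayList.
    static <T> List<T> toMutable(List<T> immutable) {
        return new ArrayList<>(immutable);
    }

    // Serializing direction: the list is handed back unchanged.
    static <T> List<T> toEncodable(List<T> mutable) {
        return mutable;
    }

    public static void main(String[] args) {
        // List.of stands in for the immutable list produced by deserialization.
        List<Integer> parsed = List.of(1, 2, 3);
        List<Integer> mutable = toMutable(parsed);
        mutable.add(4); // the same call on 'parsed' would throw UnsupportedOperationException
        System.out.println(mutable);
    }
}
</syntaxhighlight>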
== Records ==
RecordCodecBuilder is used to generate codecs that serialize instances of classes with explicitly named fields, like our example above. Codecs created via RecordCodecBuilder serialize things to maplike objects, such as {} json objects or CompoundTags.
RecordCodecBuilder can be used in several ways, but the simplest form is as follows:
<syntaxhighlight lang="java">
public static final Codec<SomeJavaClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
        someFieldCodecA.fieldOf("field_name_a").forGetter(SomeJavaClass::getFieldA),
        someFieldCodecB.fieldOf("field_name_b").forGetter(SomeJavaClass::getFieldB),
        someFieldCodecC.fieldOf("field_name_c").forGetter(SomeJavaClass::getFieldC)
        // up to 16 fields can be declared here
    ).apply(instance, SomeJavaClass::new));
</syntaxhighlight>
Each line in the group specifies a codec instance for the type of that field, the field name in the serialized object, and the corresponding getter function in the Java class. The builder is concluded by specifying a constructor or factory for the Java class whose arguments are the previously defined fields in the same order.
For example, using RecordCodecBuilder to create a codec for our example class above:
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
        Codec.INT.fieldOf("some_int").forGetter(ExampleCodecClass::getSomeInt),
        Registry.ITEM.fieldOf("item").forGetter(ExampleCodecClass::getItem),
        BlockPos.CODEC.listOf().fieldOf("block_positions").forGetter(ExampleCodecClass::getBlockPositions)
    ).apply(instance, ExampleCodecClass::new));
</syntaxhighlight>
===Optional and Default Values in Record Fields===
When RecordCodecBuilder is used as shown above, all of the fields are ''required'' to be in the serialized object (the JsonObject/CompoundTag/etc), or the entire thing will fail to parse when the codec tries to deserialize it. If we wish to have optional or default values, there are several alternatives to fieldOf() we can use.
* <code>someCodec.optionalFieldOf("field_name")</code> creates a field for an Optional. If the field in the json/nbt is not present or invalid, it will deserialize as an empty optional. Empty optionals will not be serialized; the field will be omitted from the json or nbt.
* <code>someCodec.optionalFieldOf("field_name", someDefaultValue)</code> creates an optional field that deserializes as the given default value if the field is not present in the json/nbt. When serializing, if the field in the java object equals the default value, the value will not be serialized and the field will be omitted from the json or nbt.
When using optional fields, be wary that if the field contains bad data or otherwise fails to deserialize, the error will be silently caught, and the field will deserialize as the default value instead!
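The behaviour described above can be modelled in plain Java. This is only an illustrative sketch that does not use the DataFixerUpper API: a <code>Map</code> stands in for the serialized object, and the field name <code>count</code> and its default value are hypothetical.

<syntaxhighlight lang="java">
import java.util.HashMap;
import java.util.Map;

final class DefaultFieldSketch {
    static final String FIELD = "count"; // hypothetical field name
    static final int DEFAULT_COUNT = 1;  // hypothetical default value

    // Reading: a missing field, or one holding the wrong kind of data, yields the default.
    static int read(Map<String, Object> serialized) {
        Object raw = serialized.get(FIELD);
        return raw instanceof Integer ? (Integer) raw : DEFAULT_COUNT;
    }

    // Writing: the field is omitted when the value equals the default.
    static Map<String, Object> write(int count) {
        Map<String, Object> serialized = new HashMap<>();
        if (count != DEFAULT_COUNT) {
            serialized.put(FIELD, count);
        }
        return serialized;
    }

    public static void main(String[] args) {
        System.out.println(read(Map.of()));              // field absent
        System.out.println(read(Map.of(FIELD, "oops"))); // bad data is silently replaced
        System.out.println(write(DEFAULT_COUNT));        // field omitted
        System.out.println(write(5));                    // field written
    }
}
</syntaxhighlight>

Note how the bad-data case is indistinguishable from the missing-field case to the caller, which is why the warning above matters when debugging datapacks.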
===Boxing values as objects===
In some situations, we may need to serialize a single value as a single-field object. We can use fieldOf to box a single value in this way without needing the entire RecordCodecBuilder process:
<syntaxhighlight lang="java">
public static final Codec<Integer> BOXED_INT_CODEC = Codec.INT.fieldOf("value").codec();
JsonElement value = BOXED_INT_CODEC.encodeStart(JsonOps.INSTANCE, 5).result().get();
</syntaxhighlight>
This produces the following output:
<syntaxhighlight lang="json">
{"value":5}
</syntaxhighlight>
==Unit==
The <code>Codec.unit(defaultValue)</code> codec creates a Codec that always deserializes a specified default value, regardless of input. When serializing, it serializes nothing.
==Pair==
The <code>Codec.pair(codecA, codecB)</code> static method takes two codecs and generates a Codec<Pair<A,B>> from them.
The only valid arguments for this method are codecs that serialize to objects with explicit fields, such as codecs created using [[Codecs#Records/1.17|RecordCodecBuilder]] or [[Codecs#Boxing values as objects/1.17|fieldOf]]. Codecs that serialize nothing (such as [[Codecs#Unit/1.17|unit codecs]]) are also valid as they act as objects-with-no-fields.
The resulting Pair codec will serialize a single object that has all of the fields of the two original codecs. For example:
<syntaxhighlight lang="java">
public static final Codec<Pair<Integer,String>> PAIR_CODEC = Codec.pair(
    Codec.INT.fieldOf("value").codec(),
    Codec.STRING.fieldOf("name").codec());
JsonElement encodedPair = PAIR_CODEC.encodeStart(JsonOps.INSTANCE, Pair.of(5, "cheese")).result().get();
</syntaxhighlight>
This codec serializes the above value to:
<syntaxhighlight lang="json">
{
    "value": 5,
    "name": "cheese"
}
</syntaxhighlight>
Codecs that serialize to objects with undefined fields such as [[Codecs#Maps/1.17|unboundedMap]] may cause strange and unpredictable behaviour when used here; these objects should be boxed via fieldOf when used in a pair codec.
==Either==
The <code>Codec.either(codecA, codecB)</code> static method takes two codecs and generates a Codec<Either<A,B>> from them.
When this codec is used to de/serialize an object, it first attempts to use the first codec; if and only if that conversion fails, it then attempts to use the second codec. If that conversion also fails, then the returned DataResult will contain the error data from the ''second'' codec's conversion attempt.
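The fallback order can be modelled in plain Java as follows. This is a simplified sketch that does not use the DataFixerUpper classes: the <code>Result</code> class is a hypothetical stand-in for DataResult, and the two parsers are arbitrary examples.

<syntaxhighlight lang="java">
final class EitherFallbackSketch {
    // Tiny stand-in for a parse result: a value on success, an error message on failure.
    static final class Result {
        final Object value;
        final String error;

        private Result(Object value, String error) {
            this.value = value;
            this.error = error;
        }

        static Result success(Object value) { return new Result(value, null); }
        static Result failure(String error) { return new Result(null, error); }
        boolean isSuccess() { return error == null; }
    }

    // First alternative: an integer.
    static Result parseInt(String raw) {
        try {
            return Result.success(Integer.valueOf(raw));
        } catch (NumberFormatException e) {
            return Result.failure("not an integer: " + raw);
        }
    }

    // Second alternative: a boolean literal.
    static Result parseBool(String raw) {
        if (raw.equals("true") || raw.equals("false")) {
            return Result.success(Boolean.valueOf(raw));
        }
        return Result.failure("not a boolean: " + raw);
    }

    // Try the first parser; only if it fails, return whatever the second parser produces.
    static Result parseEither(String raw) {
        Result first = parseInt(raw);
        if (first.isSuccess()) {
            return first;
        }
        return parseBool(raw);
    }

    public static void main(String[] args) {
        System.out.println(parseEither("42").value);
        System.out.println(parseEither("true").value);
        System.out.println(parseEither("cheese").error); // error comes from the second parser
    }
}
</syntaxhighlight>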
==Numeric Ranges==
The <code>Codec.intRange(min,max)</code>, <code>Codec.floatRange(min,max)</code>, and <code>Codec.doubleRange(min,max)</code> static methods generate Codecs for Integers, Floats, or Doubles, respectively, for which only a specified inclusive range is valid, and values outside that range will fail to de/serialize.
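To make the meaning of "inclusive" concrete, the check can be sketched in plain Java. The class name and error message are hypothetical and are not taken from DataFixerUpper:

<syntaxhighlight lang="java">
import java.util.Optional;

final class InclusiveRangeSketch {
    // Returns an error message when value lies outside [min, max], or empty when it is valid.
    static Optional<String> check(int min, int max, int value) {
        if (value < min || value > max) {
            return Optional.of("Value " + value + " is outside the range [" + min + ", " + max + "]");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(check(0, 10, 0).isPresent());  // lower bound itself is valid
        System.out.println(check(0, 10, 10).isPresent()); // upper bound itself is valid
        System.out.println(check(0, 10, 11).isPresent()); // just outside the range
    }
}
</syntaxhighlight>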
==Maps==
Suppose we want to serialize a HashMap or other Map type, where we could have indefinitely many key-value pairs and we don't know what the keys are ahead of time.
We can create a <code>Codec<Map<KEY,VALUE>></code> using the <code>Codec.unboundedMap</code> static method, which takes a key codec and a value codec and creates a codec for a map type:
<syntaxhighlight lang="java">
public static final Codec<Map<String, BlockPos>> MAP_CODEC = Codec.unboundedMap(Codec.STRING, BlockPos.CODEC);
</syntaxhighlight>
The serialized form of maps serialized by this codec will be a JsonObject or CompoundTag whose fields are the key-value pairs in the map; the map's keys will be used as the field names, and the map's values will be the values of those fields.
A limitation of using unboundedMap is that it only supports key codecs that serialize to Strings (including codecs for things like ResourceLocation that aren't Strings themselves but still serialize to strings). To create a codec for a Map whose keys are not fundamentally strings, the Map must be serialized as a list of key-value pairs instead of using unboundedMap.
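One possible approach, sketched here under assumptions, is to build a list codec of key-value pairs and convert between the list and the Map with a pair of functions such as those below (for example via xmap). The class and method names are hypothetical, and only plain Java collections are used:

<syntaxhighlight lang="java">
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MapAsEntryList {
    // Serializing direction: flatten the map into a list of key-value pairs.
    // Map.entry rejects null keys and values, so this assumes the map contains none.
    static <K, V> List<Map.Entry<K, V>> toEntryList(Map<K, V> map) {
        List<Map.Entry<K, V>> entries = new ArrayList<>();
        for (Map.Entry<K, V> entry : map.entrySet()) {
            entries.add(Map.entry(entry.getKey(), entry.getValue()));
        }
        return entries;
    }

    // Deserializing direction: rebuild the map; later duplicates of a key overwrite earlier ones.
    static <K, V> Map<K, V> fromEntryList(List<Map.Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : entries) {
            map.put(entry.getKey(), entry.getValue());
        }
        return map;
    }

    public static void main(String[] args) {
        // Keys that are not strings: lists of coordinates stand in for positions.
        Map<List<Integer>, String> names = new LinkedHashMap<>();
        names.put(List.of(0, 0, 0), "origin");
        names.put(List.of(10, 20, -100), "outpost");
        List<Map.Entry<List<Integer>, String>> entries = toEntryList(names);
        System.out.println(entries.size());
        System.out.println(fromEntryList(entries).equals(names));
    }
}
</syntaxhighlight>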
==Equivalent Types and xmap==
Suppose we have two java classes, Amalgam and Box; any Amalgam instance can be converted to a Box, and vice-versa. Now suppose we have a Codec<Amalgam>, but we'd also like to have a Codec<Box>. Rather than creating an entirely new codec for Box from scratch, we can simply xmap our Amalgam codec instead.
The <code>Codec#xmap</code> instance method is used to generate a second codec for a fundamentally equivalent type to the first codec's type. The method takes two function objects as arguments: the first converts the first type to the second when deserializing, and the second converts the second type back to the first when serializing:
<syntaxhighlight lang="java">
public static final Codec<Box> BOX_CODEC = Amalgam.CODEC.xmap(Amalgam::toBox, Box::toAmalgam);
</syntaxhighlight>
Codecs created in this manner will serialize objects in the same format as the starting codec.
==Partially Equivalent Types, flatComapMap, comapFlatMap, and flatXMap==
Consider the ResourceLocation: Any ResourceLocation can be converted to a String, but not all Strings can be converted to a ResourceLocation; ResourceLocations have strict limits on their format and allowed characters.
While we ''could'' use xmap to convert the Codec.STRING to a codec for ResourceLocations, this would cause attempts to parse an invalid string like <code>SHOUTY:MOD:Invalid$Characters</code> to throw a runtime exception, when we really should be returning a failed DataResult to the parser instead; this is indeed what the vanilla ResourceLocation codec does.
Codecs have three additional instance methods for creating equivalent codecs for when we can ''potentially'' convert one type to another, but are not guaranteed to be able to do so. These take conversion function arguments that return DataResults, allowing validation to be performed during serialization and deserialization.
{| class="wikitable"
|+ Codec Conversion Methods
|-
! Can A always be converted to B? !! Can B always be converted to A? !! Which method of codecA should be used to create codecB?
|-
| yes || yes || codecA.xmap
|-
| yes || no || codecA.flatComapMap
|-
| no || yes || codecA.comapFlatMap
|-
| no || no || codecA.flatXmap
|}
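As an illustration of the "no / yes" row, the sketch below models a String-to-identifier conversion that reports failure as a value instead of throwing. It is plain Java and not the DataFixerUpper API: the <code>Result</code> and <code>Identifier</code> classes are hypothetical stand-ins, and the pattern is a simplified assumption, not the vanilla ResourceLocation rules.

<syntaxhighlight lang="java">
import java.util.regex.Pattern;

final class IdentifierValidationSketch {
    // Hypothetical stand-in for an identifier type such as ResourceLocation.
    static final class Identifier {
        final String namespace;
        final String path;

        Identifier(String namespace, String path) {
            this.namespace = namespace;
            this.path = path;
        }

        // Identifier -> String always succeeds.
        @Override
        public String toString() { return namespace + ":" + path; }
    }

    // Hypothetical stand-in for a result that holds either a value or an error message.
    static final class Result {
        final Identifier value;
        final String error;

        private Result(Identifier value, String error) {
            this.value = value;
            this.error = error;
        }

        boolean isSuccess() { return error == null; }
    }

    // Simplified pattern assumed for illustration only.
    private static final Pattern ID = Pattern.compile("[a-z0-9_.-]+:[a-z0-9_./-]+");

    // String -> Identifier can fail, so a failed result is returned instead of throwing.
    static Result parse(String raw) {
        if (!ID.matcher(raw).matches()) {
            return new Result(null, "Not a valid identifier: " + raw);
        }
        int colon = raw.indexOf(':');
        return new Result(new Identifier(raw.substring(0, colon), raw.substring(colon + 1)), null);
    }

    public static void main(String[] args) {
        System.out.println(parse("minecraft:gold_ingot").value);
        System.out.println(parse("SHOUTY:MOD:Invalid$Characters").error);
    }
}
</syntaxhighlight>

A fallible function shaped like <code>parse</code> paired with an infallible one like <code>toString</code> is the combination the comapFlatMap row describes.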
== Registry Dispatch ==
Registry Dispatch Codecs allow us to define a registry of codecs and delegate to a specific codec to deserialize a particular json based on a type field in that json. Dispatch codecs are used extensively when deserializing worldgen data.
To create a dispatch codec for a Thing class, the following steps can be performed:
# Create a Thing abstract class and a ThingType interface. The ThingType interface should have a method that supplies a Codec<Thing>, while Thing subclasses must define a method that supplies a ThingType.
# Create a map or registry of ThingTypes, and register a ThingType for each sub-codec we want to have.
# Create a Codec<ThingType>, or have the ThingType registry implement Codec.
# Create our Codec<Thing> master codec by invoking <code>Codec#dispatch</code> on our ThingType codec. This method's arguments are:
## A field name for the ID of the sub-codec (the example json below is using "type")
## A function to retrieve a ThingType from a Thing
## A function to retrieve a Codec<Thing> from a ThingType
We can then use our Codec<Thing> to create Thing fields in other codecs whose serialized format depends on the specific sub-codec used by a Thing instance.
As an example of this, consider the ExampleCodecClass from earlier. Suppose we make this class extend Thing and register our codec for it to a codec dispatch registry with the id "ourmod:exampleclass". If we were to define an instance of this class in a Thing field in some json, it would look like:
<syntaxhighlight lang="json">
"some_thing":
{
    "type": "ourmod:exampleclass",
    "some_int": 42,
    "item": "minecraft:gold_ingot",
    "block_positions": [
        [0, 0, 0],
        [10, 20, -100]
    ]
}
</syntaxhighlight>
Other ThingTypes we register would have different fields in this json object, but would still be valid for the "some_thing" field.
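The dispatch idea itself can be modelled without DataFixerUpper. The plain-Java sketch below is only a simplified illustration: a <code>Map</code> stands in for the parsed json object, the decoder functions stand in for sub-codecs, and the type ids and Thing subclasses are hypothetical.

<syntaxhighlight lang="java">
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class DispatchSketch {
    interface Thing {}

    static final class NamedThing implements Thing {
        final String name;
        NamedThing(String name) { this.name = name; }
    }

    static final class CountedThing implements Thing {
        final int count;
        CountedThing(int count) { this.count = count; }
    }

    // Registry of decoders keyed by type id; both ids are hypothetical.
    static final Map<String, Function<Map<String, Object>, Thing>> DECODERS = new HashMap<>();

    static {
        DECODERS.put("ourmod:named", fields -> new NamedThing((String) fields.get("name")));
        DECODERS.put("ourmod:counted", fields -> new CountedThing((Integer) fields.get("count")));
    }

    // Reads the "type" field, then delegates the rest of the object to the matching decoder.
    static Optional<Thing> decode(Map<String, Object> fields) {
        Function<Map<String, Object>, Thing> decoder = DECODERS.get(fields.get("type"));
        return decoder == null ? Optional.empty() : Optional.of(decoder.apply(fields));
    }

    public static void main(String[] args) {
        Optional<Thing> named = decode(Map.of("type", "ourmod:named", "name", "cheese"));
        System.out.println(named.isPresent());
        // An unregistered type id has no decoder, so nothing is produced.
        System.out.println(decode(Map.of("type", "ourmod:unknown")).isPresent());
    }
}
</syntaxhighlight>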
Several examples of vanilla classes that use dispatch codecs:
* RuleTest and RuleTestType
* BlockPlacer and BlockPlacerType
* ConfiguredDecorator and FeatureDecorator
=External Links=
* [https://github.com/Mojang/DataFixerUpper/blob/master/src/main/java/com/mojang/serialization/Codec.java Codecs in Mojang's official public DataFixerUpper repository]
* [https://kvverti.github.io/Documented-DataFixerUpper/snapshot/com/mojang/serialization/Codec.html#flatXmap-java.util.function.Function-java.util.function.Function- Unofficial Codec Javadocs]