Difference between revisions of "Codecs"

From Forge Community Wiki
(Basic Codec Rundown)
 
 
(15 intermediate revisions by 4 users not shown)
Line 1: Line 1:
Codecs are an abstraction layer around [[DynamicOps|DynamicOps]] which allow objects to be serialized and deserialized in different contexts such as JSON or NBT. It creates an easy way to read and interpret objects without the need of manual labor.
+
Codecs are a serialization tool from Mojang's DataFixerUpper library. Codecs are used alongside [[DynamicOps|DynamicOps]] to serialize objects to different formats, such as JSON or [[Using_NBT|NBT]], and to deserialize them back. While the DynamicOps describes the format the object is to be serialized to, the Codec describes the manner in which the object is to be serialized; a single Codec can be used to serialize an object to any format for which a DynamicOps exists. Codecs and DynamicOps jointly form an abstraction layer over data serialization, reducing the effort needed to serialize or deserialize data.
==Codec Serialization and Deserialization==
+
 
Codec serialization and deserialization are handled through two main methods: <code>Codec#encodeStart</code> and <code>Codec#parse</code>, respectively. Each of these methods returns a <code>DataResult</code> which holds the encoded object type or the decoded object class. An <code>Optional</code> of the resulting output can be grabbed via <code>DataResult#result</code>. If a custom message should be specified along with the error message, that can be specified using <code>DataResult#resultOrPartial</code>. Alternatively, <code>DataResult#getOrThrow</code> can be used, which returns the instance directly instead of an Optional.
+
= Using Codecs =
==Creating a Codec for a Class==
+
 
Let's assume there is the following class structure that a codec should be created for:
+
== Serialization and Deserialization ==
 +
 
 +
The primary use for Codecs is to serialize Java objects to some serialized type, such as a JsonElement or a Tag, and to deserialize a serialized object back to its proper Java type. This is accomplished with <code>Codec#encodeStart</code> and <code>Codec#parse</code>, respectively. Given a Codec<SomeJavaType> and a DynamicOps<SomeSerializedType>, we can convert instances of SomeJavaType to instances of SomeSerializedType and back.
 +
 
 +
Each of these methods takes a [[DynamicOps]] instance and the object we are serializing or deserializing, and returns a DataResult:
 +
 
 +
<syntaxhighlight lang="java">
 +
// let someCodec be a Codec<SomeJavaType>
 +
// let someJavaObject be an instance of SomeJavaType
 +
// let someTag and someJsonElement be instances of Tag and JsonElement, respectively
 +
 
 +
// serialize some java object to Tag
 +
DataResult<Tag> result = someCodec.encodeStart(NbtOps.INSTANCE, someJavaObject);
 +
 
 +
// deserialize some Tag instance back to a proper java object
 +
DataResult<SomeJavaType> result = someCodec.parse(NbtOps.INSTANCE, someTag);
 +
 
 +
// serialize some java object to a JsonElement
 +
DataResult<JsonElement> result = someCodec.encodeStart(JsonOps.INSTANCE, someJavaObject);
 +
 
 +
// deserialize a JsonElement back to a proper java object
 +
DataResult<SomeJavaType> result = someCodec.parse(JsonOps.INSTANCE, someJsonElement);
 +
</syntaxhighlight>
 +
 
 +
A DataResult either holds the converted instance, or it holds some error data, depending on whether the conversion was successful or not, respectively. There are several things we can do with this DataResult; <code>DataResult#result</code> simply returns an Optional containing the converted object if the conversion was successful, while <code>DataResult#resultOrPartial</code> also runs a given function if the conversion was unsuccessful (in addition to returning the Optional). #resultOrPartial is particularly useful for logging errors during datapack deserialization:
 +
 
 +
<syntaxhighlight lang="java">
 +
// deserialize something from json
 +
someCodec.parse(JsonOps.INSTANCE, someJsonElement)
 +
.resultOrPartial(errorMessage -> doSomethingIfBadData(errorMessage))
 +
.ifPresent(someJavaObject -> doSomethingIfGoodData(someJavaObject));
 +
</syntaxhighlight>
 +
 
 +
== Builtin Codecs ==
 +
=== Primitive Codecs ===
 +
The Codec class itself contains static instances of codecs for all supported primitive types, e.g. <code>Codec.STRING</code> is the canonical <code>Codec<String></code> implementation. Primitive codecs include:
 +
* BOOL, which serializes to a boolean value
 +
* BYTE, SHORT, INT, LONG, FLOAT, and DOUBLE, which serialize to numeric values
 +
* STRING, which serializes to a string
 +
* BYTE_BUFFER, INT_STREAM, and LONG_STREAM, which serialize to lists of numbers
 +
* EMPTY, which represents null objects
 +
 
 +
=== Other Builtin Codecs ===
 +
Vanilla Minecraft has many builtin codecs for objects that it frequently serializes. These codecs are typically static instances in the class the codec is serializing; e.g. <code>ResourceLocation.CODEC</code> is the canonical <code>Codec<ResourceLocation></code>, while <code>BlockPos.CODEC</code> is the codec used for serializing a BlockPos.
 +
 
 +
Each vanilla <code>Registry</code> acts as the Codec for the type of object the registry contains; e.g. <code>Registry.BLOCK</code> is itself a <code>Codec<Block></code>. Forge Registries, however, do not currently implement Codec and cannot yet be used in this way; custom codecs must be created for forge-specific registries that are not tied to specific vanilla registries.
 +
 
 +
Of particular note here is the CompoundTag.CODEC, which can be used to e.g. serialize a CompoundTag into a json file. This has a notable limitation in that CompoundTag.CODEC ''cannot'' safely deserialize lists of numbers from json, due to the strong typing of ListTag and the way that the NbtOps deserializer reads numeric values.
 +
 
 +
= Creating Codecs =
 +
 
 +
Suppose we have the following class, and we want to deserialize json files to instances of this class:
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
 
public class ExampleCodecClass {
 
public class ExampleCodecClass {
  
     private final int field_1;
+
     private final int someInt;
     private final List<BlockPos> field_2;
+
     private final Item item;
     private final Item field_3;
+
     private final List<BlockPos> blockPositions;
  
     public ExampleCodecClass(int field_1, List<BlockPos> field_2, Item field_3) {...}
+
     public ExampleCodecClass(int someInt, Item item, List<BlockPos> blockPositions) {...}
 +
 
 +
    public int getSomeInt() { return this.someInt; }
 +
    public Item getItem() { return this.item; }
 +
    public List<BlockPos> getBlockPositions() { return this.blockPositions; }
 +
}
 +
</syntaxhighlight>
 +
 
 +
Where a json file for an instance of this class might look like:
 +
 
 +
<syntaxhighlight lang="json">
 +
{
 +
"some_int": 42,
 +
"item": "minecraft:gold_ingot",
 +
"block_positions":
 +
[
 +
[0,0,0],
 +
[10,20,-100]
 +
]
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 +
We can assemble a codec for this class by building a new codec out of smaller codecs. We'll need a codec for each of these fields:
 +
* a <code>Codec<Integer></code>
 +
* a <code>Codec<Item></code>
 +
* a <code>Codec<List<BlockPos>></code>
 +
And then we'll need to assemble these into a <code>Codec<ExampleCodecClass></code>.
 +
 +
As previously mentioned, we can use <code>Codec.INT</code> for the integer codec, and <code>Registry.ITEM</code> for the Item codec. We don't have a builtin codec for list-of-blockpos, but we can use BlockPos.CODEC to create one.
 +
 +
==Lists==
 +
The <code>Codec#listOf</code> instance method can be used to generate a codec for a List from an existing codec:
  
For each basic object instance, a codec can be constructed using <code>RecordCodecBuilder::create</code>. This takes in a function that converts an <code>Instance</code> of an object, which is a group of codecs for each serializable field, to an <code>App</code>, which is an unary type constructor for allowing algorithms to be generalized using generics.
 
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
// BlockPos.CODEC is a Codec<BlockPos>
    return ...;
+
Codec<List<BlockPos>> blockPosListCodec = BlockPos.CODEC.listOf();
});
 
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 +
Codecs created via listOf() serialize things to listlike objects, such as [] json arrays or ListTags.
  
To add the field codecs, <code>Instance#group</code> is used; it takes codecs that have each been converted into an <code>App</code> of some kind, and returns a product type denoted by <code>Pn</code>, where n is the number of fields in the instance. This example will examine three such scenarios.
+
Deserializing a list in this manner produces an ''immutable'' list. If a mutable list is needed, [[Codecs#Equivalent_Types_and_xmap|xmap]] can be used to convert the list after deserializing.
  
 +
== Records ==
 +
RecordCodecBuilder is used to generate codecs that serialize instances of classes with explicitly named fields, like our example above. Codecs created via RecordCodecBuilder serialize things to maplike objects, such as {} json objects or CompoundTags.
  
First, there is a primitive integer field. All primitive codecs are declared within the <code>Codec</code> class along with a few extra primitive streams (in this case we will use <code>Codec#INT</code>). To convert this codec into a valid key-pair form, the parameter name needs to be specified. This can be done using <code>Codec#fieldOf</code> which will take in a string which represents the key of this field. This will convert the codec into a MapCodec which as the name states creates a key-value pair to deserialize the instance from. From there, how to serialize the instance from the class object must also be specified. This can be done using <code>MapCodec#forGetter</code> which takes in a function that converts the class object to the type instance, hence the getter method name. This creates a <code>RecordCodecBuilder</code> which will be the final state of the codec as it is an instance of <code>App</code>.
+
RecordCodecBuilder can be used in several ways, but the simplest form is as follows:
 +
 
 +
<syntaxhighlight lang="java">
 +
public static final Codec<SomeJavaClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
 +
someFieldCodecA.fieldOf("field_name_a").forGetter(SomeJavaClass::getFieldA),
 +
someFieldCodecB.fieldOf("field_name_b").forGetter(SomeJavaClass::getFieldB),
 +
someFieldCodecC.fieldOf("field_name_c").forGetter(SomeJavaClass::getFieldC)
 +
// up to 16 fields can be declared here
 +
).apply(instance, SomeJavaClass::new));
 +
</syntaxhighlight>
  
 +
Where each line in the group specifies a codec instance for the type of that field, the field name in the serialized object, and the corresponding getter function in the java class. The builder is concluded by specifying a constructor or factory for the java class whose arguments are the previously defined fields in the same order.
  
 +
For example, using RecordCodecBuilder to create a codec for our example class above:
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
Codec.INT.fieldOf("some_int").forGetter(ExampleCodecClass::getSomeInt),
      ...)...;
+
Registry.ITEM.fieldOf("item").forGetter(ExampleCodecClass::getItem),
});
+
BlockPos.CODEC.listOf().fieldOf("block_positions").forGetter(ExampleCodecClass::getBlockPositions)
 +
).apply(instance, ExampleCodecClass::new));
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 +
===Optional and Default Values in Record Fields===
 +
When RecordCodecBuilder is used as shown above, all of the fields are ''required'' to be in the serialized object (the JsonObject/CompoundTag/etc), or the entire thing will fail to parse when the codec tries to deserialize it. If we wish to have optional or default values, there are several alternatives to fieldOf() we can use.
 +
 +
* <code>someCodec.optionalFieldOf("field_name")</code> creates a field for an Optional. If the field in the json/nbt is not present or invalid, it will deserialize as an empty optional. Empty optionals will not be serialized; the field will be omitted from the json or nbt.
 +
* <code>someCodec.optionalFieldOf("field_name", someDefaultValue)</code> creates an optional field that deserializes as the given default value if the field is not present in the json/nbt. When serializing, if the field in the java object equals the default value, the value will not be serialized and the field will be omitted from the json or nbt.
  
Next, there is a list of <code>BlockPos</code>, which has a premade codec within itself. However, a <code>Codec<BlockPos></code> needs to be converted into a <code>Codec<List<BlockPos>></code>. Luckily, there are a few helpers within the Codec class that make some of these conversions trivial. In this case, <code>Codec#listOf</code> will convert a codec of some generic type into a codec for a list of that type. The process for attaching the codec is exactly the same.
+
When using optional fields, be wary that if the field contains bad data or otherwise fails to deserialize, the error will be silently caught, and the field will deserialize as the default value (or as an empty Optional) instead!
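
The default-value behaviour described above can be modelled in plain Java. The sketch below is only an illustration and does not use the DataFixerUpper API: the class name, the <code>count</code> field, and its default of 1 are all hypothetical. It shows the two halves of the contract: a missing field decodes to the default, and a value equal to the default is omitted when encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of an optional field with a default value; NOT the DataFixerUpper API.
final class OptionalFieldSketch {

    // Hypothetical default used when the "count" field is absent.
    static final int DEFAULT_COUNT = 1;

    // Decoding: a missing field yields the default value.
    static int decodeCount(Map<String, Integer> serialized) {
        return serialized.getOrDefault("count", DEFAULT_COUNT);
    }

    // Encoding: a value equal to the default is omitted from the output.
    static Map<String, Integer> encodeCount(int count) {
        Map<String, Integer> out = new LinkedHashMap<>();
        if (count != DEFAULT_COUNT) {
            out.put("count", count);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(encodeCount(DEFAULT_COUNT)); // the field is omitted
        System.out.println(encodeCount(5));             // the field is written
        System.out.println(decodeCount(Map.of()));      // the default is used
    }
}
```

A consequence worth noticing is that the serialized form cannot distinguish "field omitted" from "field explicitly set to the default".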
  
 +
===Boxing values as objects===
 +
In some situations, we may need to serialize a single value as a single-field object. We can use fieldOf to box a single value in this way without needing the entire RecordCodecBuilder process:
  
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public static final Codec<Integer> BOXED_INT_CODEC = Codec.INT.fieldOf("value").codec();
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
 
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
+
JsonElement value = BOXED_INT_CODEC.encodeStart(JsonOps.INSTANCE, 5).result().get();
      ...)...;
+
</syntaxhighlight>
});
+
 
 +
Which serializes the following output:
 +
 
 +
<syntaxhighlight lang="json">
 +
{"value":5}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 +
==Unit==
 +
The <code>Codec.unit(defaultValue)</code> static method creates a Codec that always deserializes to a specified default value, regardless of input. When serializing, it serializes nothing.
 +
 +
==Pair==
 +
The <code>Codec.pair(codecA, codecB)</code> static method takes two codecs and generates a Codec<Pair<A,B>> from them.
  
A few other notable mentions that might be used within a codec:
+
The only valid arguments for this method are codecs that serialize to objects with explicit fields, such as codecs created using [[Codecs#Records|RecordCodecBuilder]] or [[Codecs#Boxing_values_as_objects|fieldOf]]. Codecs that serialize nothing (such as [[Codecs#Unit|unit codecs]]) are also valid as they act as objects-with-no-fields.
{| class="wikitable" style="margin-left: auto; margin-right: auto; width: 1415px;" data-mce-style="margin-left: auto; margin-right: auto; width: 1415px;"
+
 
|-
+
The resulting Pair codec will serialize a single object that has all of the fields of the two original codecs. For example:
| style="width: 52px;" data-mce-style="width: 52px;"|Method
+
<syntaxhighlight lang="java">
| style="width: 640px;" data-mce-style="width: 640px;"|Description
+
public static final Codec<Pair<Integer,String>> PAIR_CODEC = Codec.pair(
|-
+
Codec.INT.fieldOf("value").codec(),
| style="width: 52px;" data-mce-style="width: 52px;"|intRange
+
Codec.STRING.fieldOf("name").codec());
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a integer codec with a valid inclusive range.
+
 
 +
JsonElement encodedPair = PAIR_CODEC.encodeStart(JsonOps.INSTANCE, Pair.of(5, "cheese")).result().get();
 +
</syntaxhighlight>
 +
This codec serializes the above value to:
 +
<syntaxhighlight lang="json">
 +
{
 +
"value": 5,
 +
"name": "cheese"
 +
}
 +
</syntaxhighlight>
 +
 
 +
Codecs that serialize to objects with undefined fields such as [[Codecs#Maps|unboundedMap]] may cause strange and unpredictable behaviour when used here; these objects should be boxed via fieldOf when used in a pair codec.
 +
 
 +
==Either==
 +
The <code>Codec.either(codecA, codecB)</code> static method takes two codecs and generates a Codec<Either<A,B>> from them.
 +
 
 +
When this codec is used to de/serialize an object, it first attempts to use the first codec; if and only if that conversion fails, it then attempts to use the second codec. If that conversion also fails, then the returned DataResult will contain the error data from the ''second'' codec's conversion attempt.
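
The fallback order can be modelled in plain Java. This is a sketch under stated assumptions, not the DataFixerUpper implementation: <code>Result</code> is a hypothetical stand-in for DataResult, and the two parsers are made up for illustration. It shows that the second parser only runs when the first fails, and that a double failure reports the second parser's error.

```java
import java.util.Optional;
import java.util.function.Function;

// Plain-Java model of the fallback order; NOT the DataFixerUpper API.
final class EitherFallbackSketch {

    // Hypothetical stand-in for DataResult: a value on success, an error message on failure.
    record Result<T>(Optional<T> value, String error) {
        static <T> Result<T> success(T value) { return new Result<>(Optional.of(value), null); }
        static <T> Result<T> failure(String error) { return new Result<>(Optional.empty(), error); }
    }

    // Try the first parser; only if it fails, return whatever the second parser produces,
    // so a double failure carries the second parser's error.
    static <T> Result<T> parseEither(String input,
                                     Function<String, Result<T>> first,
                                     Function<String, Result<T>> second) {
        Result<T> firstResult = first.apply(input);
        return firstResult.value().isPresent() ? firstResult : second.apply(input);
    }

    // Hypothetical first parser: accepts integers.
    static Result<Object> parseInt(String s) {
        try {
            return Result.success(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return Result.failure("not an int: " + s);
        }
    }

    // Hypothetical second parser: accepts the literals "true" and "false".
    static Result<Object> parseBool(String s) {
        if (s.equals("true") || s.equals("false")) {
            return Result.success(Boolean.parseBoolean(s));
        }
        return Result.failure("not a boolean: " + s);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"42", "true", "cheese"}) {
            Result<Object> r = parseEither(input, EitherFallbackSketch::parseInt, EitherFallbackSketch::parseBool);
            System.out.println(input + " -> " + r);
        }
    }
}
```

Because only the second error survives, it can help to put the codec with the more informative error messages second.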
 +
 
 +
==Numeric Ranges==
 +
The <code>Codec.intRange(min,max)</code>, <code>Codec.floatRange(min,max)</code>, and <code>Codec.doubleRange(min,max)</code> static methods generate Codecs for Integers, Floats, or Doubles, respectively, for which only a specified inclusive range is valid, and values outside that range will fail to de/serialize.
 +
 
 +
==Maps==
 +
Suppose we want to serialize a HashMap or other Map type, where we could have indefinitely many key-value pairs and we don't know what the keys are ahead of time.
 +
 
 +
We can create a <code>Codec<Map<KEY,VALUE>></code> using the <code>Codec.unboundedMap</code> static method, which takes a key codec and a value codec and creates a codec for a map type:
 +
 
 +
<syntaxhighlight lang="java">
 +
public static final Codec<Map<String, BlockPos>> MAP_CODEC = Codec.unboundedMap(Codec.STRING, BlockPos.CODEC);
 +
</syntaxhighlight>
 +
 
 +
The serialized form of maps serialized by this codec will be a JsonObject or CompoundTag, whose fields are the key-value pairs in the map; the map's keys will be used as the field names, and the map's values will be the values of those fields.
 +
 
 +
A limitation of using unboundedMap is that it only supports key codecs that serialize to Strings (including codecs for things like ResourceLocation that aren't Strings themselves but still serialize to strings). To create a codec for a Map whose keys are not fundamentally strings, the Map must be serialized as a list of key-value pairs instead of using unboundedMap.
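
One way to approach the list-of-pairs form is to write the two conversions between a Map and a list of entries first; functions of this shape are the kind that could then be handed to xmap on a list codec for the entry type. The sketch below covers only those conversions in plain Java, with a hypothetical <code>Entry</code> record; the codec wiring is left out because it needs DataFixerUpper on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conversions between a Map and a list of key-value entries; plain Java, no codec wiring.
final class MapAsPairListSketch {

    // Hypothetical key-value holder; a codec for it could be built like any other record codec.
    record Entry<K, V>(K key, V value) {}

    // Map -> list of entries (the direction used when serializing).
    static <K, V> List<Entry<K, V>> toEntries(Map<K, V> map) {
        List<Entry<K, V>> entries = new ArrayList<>();
        map.forEach((key, value) -> entries.add(new Entry<>(key, value)));
        return entries;
    }

    // List of entries -> Map (the direction used when deserializing).
    // A later entry with a duplicate key overwrites an earlier one.
    static <K, V> Map<K, V> toMap(List<Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Entry<K, V> entry : entries) {
            map.put(entry.key(), entry.value());
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, String> byNumber = new LinkedHashMap<>();
        byNumber.put(1, "stone");
        byNumber.put(2, "dirt");
        List<Entry<Integer, String>> entries = toEntries(byNumber);
        System.out.println(entries);
        System.out.println(toMap(entries));
    }
}
```

Note that a list form permits duplicate keys in the serialized data, so the deserializing direction has to choose a policy for them; this sketch keeps the last one.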
 +
 
 +
==Equivalent Types and xmap==
 +
Suppose we have two java classes, Amalgam and Box; any Amalgam instance can be converted to a Box, and vice-versa. Now suppose we have a Codec<Amalgam>, but we'd also like to have a Codec<Box>. Rather than creating an entirely new codec for Box from scratch, we can simply xmap our Amalgam codec instead.
 +
 
 +
The <code>Codec#xmap</code> instance method is used to generate a second codec for a fundamentally equivalent type to the first codec's type. The method takes two function objects as arguments, which are used to convert the first type to the second when deserializing, and converting the second type to the first when serializing:
 +
 
 +
<syntaxhighlight lang="java">
 +
public static final Codec<Box> CODEC = Amalgam.CODEC.xmap(Amalgam::toBox, Box::toAmalgam);
 +
</syntaxhighlight>
 +
 
 +
Codecs created in this manner will serialize objects in the same format as the starting codec.
 +
 
 +
==Partially Equivalent Types, flatComapMap, comapFlatMap, and flatXMap==
 +
Consider the ResourceLocation: Any ResourceLocation can be converted to a String, but not all Strings can be converted to a ResourceLocation; ResourceLocations have strict limits on their format and allowed characters.
 +
 
 +
While we ''could'' use xmap to convert the Codec.STRING to a codec for ResourceLocations, this would cause attempts to parse an invalid string like <code>SHOUTY:MOD:Invalid$Characters</code> to throw a runtime exception, when we really should be returning a failed DataResult to the parser instead -- which indeed is what the vanilla ResourceLocation codec does.
 +
 
 +
Codecs have three additional instance methods for creating equivalent codecs for when we can ''potentially'' convert one type to another, but are not guaranteed to be able to do so. These take conversion function arguments that return DataResults, allowing validation to be performed during serialization and deserialization.
 +
 
 +
{| class="wikitable"
 +
|+ Codec Conversion Methods
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|floatRange
+
! Can A always be converted to B? !! Can B always be converted to A? !! Which method of codecA should be used to create codecB?
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a float codec with a valid inclusive range.
 
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|doubleRange
+
| yes || yes || codecA.xmap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a double codec with a valid inclusive range.
 
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|pair
+
| yes || no || codecA.flatComapMap
| style="width: 640px;" data-mce-style="width: 640px;"|Create a pair using two codecs.
 
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|either
+
| no || yes || codecA.comapFlatMap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates an either (an object with some fallback object) using two codecs.
 
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|unboundedMap
+
| no || no || codecA.flatXmap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a map using two codecs.
 
 
|}
 
|}
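
The fallible direction of such a conversion is a function that reports failure as a value instead of throwing. The sketch below shows that shape in plain Java for a simplified id format. The pattern is an assumption made for illustration (lowercase <code>namespace:path</code> only) and is looser than the real ResourceLocation rules in some ways and stricter in others; with DataFixerUpper available, a function like this would return a DataResult rather than an Optional error message.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Validation that returns an error value instead of throwing; plain Java, NOT the DataFixerUpper API.
final class IdValidationSketch {

    // Assumed, simplified format for illustration: lowercase "namespace:path".
    private static final Pattern ID = Pattern.compile("[a-z0-9_.-]+:[a-z0-9_./-]+");

    // Returns an error message for an invalid string, or an empty Optional if it is acceptable.
    static Optional<String> validate(String candidate) {
        return ID.matcher(candidate).matches()
            ? Optional.empty()
            : Optional.of("Not a valid id: " + candidate);
    }

    public static void main(String[] args) {
        System.out.println(validate("minecraft:gold_ingot"));
        System.out.println(validate("SHOUTY:MOD:Invalid$Characters"));
    }
}
```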
Last, there is an Item. This is optional and should default to <code>Items#AIR</code> when not defined. Here, two techniques can be used to obtain the associated codec. By default, a <code>Registry</code> is an instance of a codec. Therefore, the codec can be obtained by specifying the registry instance (e.g. <code>Registry#ITEM</code>). What if there is no vanilla registry instance, however? Then another codec method can be used: <code>Codec#xmap</code>. This allows the associated object to be mapped to another object. One function maps the associated object to the new object for decoding, and a second maps it back for encoding. For example, the <code>ResourceLocation</code> codec can be mapped to an <code>Item</code> through the forge registry instance.
 
  
 +
== Registry Dispatch ==
 +
Registry Dispatch Codecs allow us to define a registry of codecs and delegate to a specific codec to deserialize a particular json based on a type field in that json. Dispatch codecs are used extensively when deserializing worldgen data.
 +
 +
To create a dispatch codec for a Thing class, the following steps can be performed:
 +
# Create an abstract Thing class and a ThingType interface. The ThingType interface should have a method that supplies a Codec<Thing>, while Thing subclasses must define a method that supplies a ThingType.
 +
# Create a map or registry of ThingTypes, and register a ThingType for each sub-codec we want to have.
 +
# Create a Codec<ThingType>, or have the ThingType registry implement Codec.
 +
# Create our Codec<Thing> master codec by invoking <code>Codec#dispatch</code> on our ThingType codec. This method's arguments are:
 +
## A field name for the ID of the sub-codec (the example json below is using "type")
 +
## A function to retrieve a ThingType from a Thing
 +
## A function to retrieve a Codec<Thing> from a ThingType
 +
 +
We can then use our Codec<Thing> to create Thing fields in other codecs whose serialized format depends on the specific sub-codec used by a Thing instance.
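
The idea behind the dispatch steps can be modelled in plain Java: read the type field, look up the matching sub-decoder, and delegate to it. This is a sketch under stated assumptions, not the DataFixerUpper API. The <code>Thing</code> implementations, the ids, and the use of a plain Map as the serialized form are all hypothetical, and only the deserializing direction is shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java model of dispatching on a "type" field; NOT the DataFixerUpper API.
final class DispatchSketch {

    // Every Thing knows the id of its own type, mirroring the Thing -> ThingType function.
    interface Thing { String typeId(); }

    record IntThing(int n) implements Thing {
        public String typeId() { return "ourmod:int_thing"; }
    }

    record NameThing(String name) implements Thing {
        public String typeId() { return "ourmod:name_thing"; }
    }

    // Registry of sub-decoders keyed by type id, mirroring the ThingType -> Codec function.
    static final Map<String, Function<Map<String, Object>, Thing>> DECODERS = new HashMap<>();

    static {
        DECODERS.put("ourmod:int_thing", fields -> new IntThing((Integer) fields.get("n")));
        DECODERS.put("ourmod:name_thing", fields -> new NameThing((String) fields.get("name")));
    }

    // Read the "type" field, then let the matching sub-decoder read the remaining fields.
    static Thing decode(Map<String, Object> fields) {
        Object type = fields.get("type");
        Function<Map<String, Object>, Thing> decoder = DECODERS.get(type);
        if (decoder == null) {
            throw new IllegalArgumentException("Unknown type: " + type);
        }
        return decoder.apply(fields);
    }

    public static void main(String[] args) {
        Map<String, Object> serialized = Map.of("type", "ourmod:int_thing", "n", 5);
        System.out.println(decode(serialized));
    }
}
```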
 +
 +
As an example of this, consider the ExampleCodecClass earlier. Suppose we make this class extend Thing and register our codec for it to a codec dispatch registry with the id "ourmod:exampleclass". If we were to define an instance of this class in a Thing field in some json, it would look like this:
 +
 +
<syntaxhighlight lang="json">
 +
"some_thing":
 +
{
 +
"type": "ourmod:exampleclass",
 +
"some_int": 42,
 +
"item": "minecraft:gold_ingot",
 +
"block_positions":
 +
[
 +
[0,0,0],
 +
[10,20,-100]
 +
]
 +
}
 +
</syntaxhighlight>
 +
 +
Other ThingTypes we register would have different fields in this json object, but would still be valid for the "some_thing" field.
  
To define a field as optional, <code>Codec#optionalFieldOf</code> should be used. One instance holds the value as an <code>Optional</code> while the other allows a defined default value.
+
Several examples of vanilla classes that use dispatch codecs:
 +
* RuleTest and RuleTestType
 +
* BlockPlacer and BlockPlacerType
 +
* ConfiguredDecorator and FeatureDecorator
  
 +
=== Registering MapCodecCodecs for Dispatch Subcodecs ===
 +
 +
When registering a subcodec to any dispatch codec registry, the registered subcodec should be an instance of MapCodecCodec, or the subcodec will be nested in its own object when serialized.
 +
 +
For example, suppose we register a single-field subcodec, where we use fieldOf-and-xmap to convert an int to our int-holding Thing:
  
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public record Thing(int n){}
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
public static Codec<Thing> CODEC = Codec.INT.fieldOf("n").codec().xmap(Thing::new, Thing::n);
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
 
      ResourceLocation.CODEC.xmap(loc -> ForgeRegistries.ITEMS.getValue(loc), item -> item.getRegistryName()).optionalFieldOf("field_3", Items.AIR).forGetter(obj -> obj.field_3))...;
 
});
 
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 +
This results in this json when serialized:
 +
 +
<syntaxhighlight lang="json">
 +
"some_thing":
 +
{
 +
"type": "ourmod:thing",
 +
"value": {
 +
"n": 5
 +
}
 +
}
 +
</syntaxhighlight>
  
Now, there is a product <code>P3</code>, as there are three parameters. To convert this into an <code>App</code>, the method <code>P#apply</code> should be called. This takes in an <code>Applicative</code>, which our builder is an instance of, and a function that returns the class object given the specified wrapped arguments. Passing a constructor reference is one way to supply this function and produce the codec for the class object.
+
This occurs because xmap does not produce a MapCodecCodec, and if this nested object is not desired, then our registered subcodec must be a MapCodecCodec.
  
 +
However, fieldOf() produces a MapCodec, which has a codec() method, which does produce a MapCodecCodec. We can rearrange our codec builder, which then produces a cleaner json:
  
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public static Codec<Thing> CODEC = Codec.INT // Primitive codec
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
.fieldOf("n") // MapCodec
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
+
.xmap(Thing::new, Thing::n) // MapCodec
      ResourceLocation.CODEC.xmap(loc -> ForgeRegistries.ITEMS.getValue(loc), item -> item.getRegistryName()).optionalFieldOf("field_3", Items.AIR).forGetter(obj -> obj.field_3))
+
.codec(); // MapCodecCodec! That's what we want.
      .apply(builder, ExampleCodecClass::new);
+
</syntaxhighlight>
});
+
 
 +
<syntaxhighlight lang="json">
 +
"some_thing":
 +
{
 +
"type": "ourmod:thing",
 +
"n": 5
 +
}
 
</syntaxhighlight>
 
</syntaxhighlight>
==Limitations==
 
Codecs are required to abide by String-Object key-value pairs. Any codec that does not have a String key will throw an error during encoding and decoding.
 
  
 +
RecordCodecBuilder also produces MapCodecCodecs.
  
A group can have at most 16 inner codecs normally. This limitation is specified by the number of product generic classes available.
+
=External Links=
 +
* [https://github.com/Mojang/DataFixerUpper/blob/master/src/main/java/com/mojang/serialization/Codec.java Codecs in Mojang's official public DataFixerUpper repository]
 +
* [https://kvverti.github.io/Documented-DataFixerUpper/snapshot/com/mojang/serialization/Codec.html#flatXmap-java.util.function.Function-java.util.function.Function- Unofficial Codec Javadocs]

Latest revision as of 18:45, 10 January 2024

Codecs are a serialization tool from Mojang's DataFixerUpper library. Codecs are used alongside DynamicOps to serialize objects to different formats, such as JSON or NBT, and to deserialize them back. While the DynamicOps describes the format the object is to be serialized to, the Codec describes the manner in which the object is to be serialized; a single Codec can be used to serialize an object to any format for which a DynamicOps exists. Codecs and DynamicOps jointly form an abstraction layer over data serialization, reducing the effort needed to serialize or deserialize data.

Using Codecs

Serialization and Deserialization

The primary use for Codecs is to serialize Java objects to some serialized type, such as a JsonElement or a Tag, and to deserialize a serialized object back to its proper Java type. This is accomplished with Codec#encodeStart and Codec#parse, respectively. Given a Codec<SomeJavaType> and a DynamicOps<SomeSerializedType>, we can convert instances of SomeJavaType to instances of SomeSerializedType and back.

Each of these methods takes a DynamicOps instance and the object we are serializing or deserializing, and returns a DataResult:

// let someCodec be a Codec<SomeJavaType>
// let someJavaObject be an instance of SomeJavaType
// let someTag and someJsonElement be instances of Tag and JsonElement, respectively

// serialize some java object to Tag
DataResult<Tag> result = someCodec.encodeStart(NbtOps.INSTANCE, someJavaObject);

// deserialize some Tag instance back to a proper java object
DataResult<SomeJavaType> result = someCodec.parse(NbtOps.INSTANCE, someTag);

// serialize some java object to a JsonElement
DataResult<JsonElement> result = someCodec.encodeStart(JsonOps.INSTANCE, someJavaObject);

// deserialize a JsonElement back to a proper java object
DataResult<SomeJavaType> result = someCodec.parse(JsonOps.INSTANCE, someJsonElement);

A DataResult either holds the converted instance, or it holds some error data, depending on whether the conversion was successful or not, respectively. There are several things we can do with this DataResult; DataResult#result simply returns an Optional containing the converted object if the conversion was successful, while DataResult#resultOrPartial also runs a given function if the conversion was unsuccessful (in addition to returning the Optional). #resultOrPartial is particularly useful for logging errors during datapack deserialization:

// deserialize something from json
someCodec.parse(JsonOps.INSTANCE, someJsonElement)
	.resultOrPartial(errorMessage -> doSomethingIfBadData(errorMessage))
	.ifPresent(someJavaObject -> doSomethingIfGoodData(someJavaObject));

Builtin Codecs

Primitive Codecs

The Codec class itself contains static instances of codecs for all supported primitive types, e.g. Codec.STRING is the canonical Codec<String> implementation. Primitive codecs include:

  • BOOL, which serializes to a boolean value
  • BYTE, SHORT, INT, LONG, FLOAT, and DOUBLE, which serialize to numeric values
  • STRING, which serializes to a string
  • BYTE_BUFFER, INT_STREAM, and LONG_STREAM, which serialize to lists of numbers
  • EMPTY, which represents null objects

Other Builtin Codecs

Vanilla Minecraft has many builtin codecs for objects that it frequently serializes. These codecs are typically static instances in the class the codec is serializing; e.g. ResourceLocation.CODEC is the canonical Codec<ResourceLocation>, while BlockPos.CODEC is the codec used for serializing a BlockPos.

Each vanilla Registry acts as the Codec for the type of object the registry contains; e.g. Registry.BLOCK is itself a Codec<Block>. Forge Registries, however, do not currently implement Codec and cannot yet be used in this way; custom codecs must be created for forge-specific registries that are not tied to specific vanilla registries.

Of particular note here is the CompoundTag.CODEC, which can be used to e.g. serialize a CompoundTag into a json file. This has a notable limitation in that CompoundTag.CODEC *cannot* safely deserialize lists of numbers from json, due to the strong typing of ListTag and the way that the NbtOps deserializer reads numeric values.

Creating Codecs

Suppose we have the following class, and we want to deserialize json files to instances of this class:

public class ExampleCodecClass {

    private final int someInt;
    private final Item item;
    private final List<BlockPos> blockPositions;

    public ExampleCodecClass(int someInt, Item item, List<BlockPos> blockPositions) {...}

    public int getSomeInt() { return this.someInt; }
    public Item getItem() { return this.item; }
    public List<BlockPos> getBlockPositions() { return this.blockPositions; }
}

Where a json file for an instance of this class might look like:

{
	"some_int": 42,
	"item": "minecraft:gold_ingot",
	"block_positions":
	[
		[0,0,0],
		[10,20,-100]
	]
}

We can assemble a codec for this class by building a new codec out of smaller codecs. We'll need a codec for each of these fields:

  • a Codec<Integer>
  • a Codec<Item>
  • a Codec<List<BlockPos>>

And then we'll need to assemble these into a Codec<ExampleCodecClass>.

As previously mentioned, we can use Codec.INT for the integer codec, and Registry.ITEM for the Item codec. We don't have a builtin codec for list-of-blockpos, but we can use BlockPos.CODEC to create one.

Lists

The Codec#listOf instance method can be used to generate a codec for a List from an existing codec:

// BlockPos.CODEC is a Codec<BlockPos>
Codec<List<BlockPos>> blockPosListCodec = BlockPos.CODEC.listOf();

Codecs created via listOf() serialize things to listlike objects, such as [] json arrays or ListTags.

Deserializing a list in this manner produces an immutable list. If a mutable list is needed, xmap can be used to convert the list after deserializing.
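The xmap conversion mentioned above might look like the following sketch, which uses a list of integers so that it only depends on DataFixerUpper and Gson. The class, field, and method names are hypothetical.

```java
import com.google.gson.JsonElement;
import com.mojang.serialization.Codec;
import com.mojang.serialization.JsonOps;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the names are not from DataFixerUpper or Minecraft
final class MutableListCodecExample {

    // listOf() decodes to an immutable list, so copy it into an ArrayList after decoding.
    // When encoding, the list is passed through unchanged.
    static final Codec<List<Integer>> MUTABLE_INT_LIST = Codec.INT.listOf()
        .xmap(list -> new ArrayList<>(list), list -> list);

    // Parse a json array of ints into a list that can be modified afterwards
    static List<Integer> parse(JsonElement json) {
        return MUTABLE_INT_LIST.parse(JsonOps.INSTANCE, json).result().orElseThrow();
    }
}
```

The same pattern should apply to BlockPos.CODEC.listOf() or any other list codec.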

Records

RecordCodecBuilder is used to generate codecs that serialize instances of classes with explicitly named fields, like our example above. Codecs created via RecordCodecBuilder serialize things to maplike objects, such as {} json objects or CompoundTags.

RecordCodecBuilder can be used in several ways, but the simplest form is as follows:

public static final Codec<SomeJavaClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
		someFieldCodecA.fieldOf("field_name_a").forGetter(SomeJavaClass::getFieldA),
		someFieldCodecB.fieldOf("field_name_b").forGetter(SomeJavaClass::getFieldB),
		someFieldCodecC.fieldOf("field_name_c").forGetter(SomeJavaClass::getFieldC),
		// up to 16 fields can be declared here
	).apply(instance, SomeJavaClass::new));

Where each line in the group specifies a codec instance for the type of that field, the field name in the serialized object, and the corresponding getter function in the java class. The builder is concluded by specifying a constructor or factory for the java class whose arguments are the previously defined fields in the same order.

For example, using RecordCodecBuilder to create a codec for our example class above:

public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
		Codec.INT.fieldOf("some_int").forGetter(ExampleCodecClass::getSomeInt),
		Registry.ITEM.fieldOf("item").forGetter(ExampleCodecClass::getItem),
		BlockPos.CODEC.listOf().fieldOf("block_positions").forGetter(ExampleCodecClass::getBlockPositions)
	).apply(instance, ExampleCodecClass::new));

Optional and Default Values in Record Fields

When RecordCodecBuilder is used as shown above, all of the fields are *required* to be in the serialized object (the JsonObject/CompoundTag/etc), or the entire thing will fail to parse when the codec tries to deserialize it. If we wish to have optional or default values, we have several alternatives of fieldOf() we can use.

  • someCodec.optionalFieldOf("field_name") creates a field for an Optional. If the field in the json/nbt is not present or invalid, it will deserialize as an empty optional. Empty optionals will not be serialized; the field will be omitted from the json or nbt.
  • someCodec.optionalFieldOf("field_name", someDefaultValue) creates an optional field that deserializes as the given default value if the field is not present in the json/nbt. When serializing, if the field in the java object equals the default value, the value will not be serialized and the field will be omitted from the json or nbt.

When using optional fields, be wary that if the field contains bad data or otherwise fails to deserialize, the error will be silently caught, and the field will deserialize as the default value instead!
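A small sketch of both optionalFieldOf variants in a record codec is shown here. The Settings record and its field names are invented for this example; only the codec calls come from DataFixerUpper.

```java
import com.mojang.serialization.Codec;
import com.mojang.serialization.codecs.RecordCodecBuilder;
import java.util.Optional;

// Hypothetical example class; Settings and its fields are made up for illustration
final class OptionalFieldExample {

    record Settings(String name, Optional<String> description, int count) {}

    static final Codec<Settings> CODEC = RecordCodecBuilder.create(instance -> instance.group(
            // required: parsing fails if "name" is missing
            Codec.STRING.fieldOf("name").forGetter(Settings::name),
            // optional: a missing "description" decodes as Optional.empty()
            Codec.STRING.optionalFieldOf("description").forGetter(Settings::description),
            // optional with default: a missing "count" decodes as 1
            Codec.INT.optionalFieldOf("count", 1).forGetter(Settings::count)
        ).apply(instance, Settings::new));
}
```

With this codec, the json object {"name": "x"} would be expected to parse to a Settings with an empty description and a count of 1, and encoding that instance again would write only the "name" field.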

Boxing values as objects

In some situations, we may need to serialize a single value as a single-field object. We can use fieldOf to box a single value in this way without needing the entire RecordCodecBuilder process:

public static final Codec<Integer> BOXED_INT_CODEC = Codec.INT.fieldOf("value").codec();

JsonElement value = BOXED_INT_CODEC.encodeStart(JsonOps.INSTANCE, 5).result().get();

Which serializes the following output:

{"value":5}

Unit

The Codec.unit(defaultValue) static method creates a codec that always deserializes to the specified default value, regardless of input. When serializing, it serializes nothing.

Pair

The Codec.pair(codecA, codecB) static method takes two codecs and generates a Codec<Pair<A,B>> from them.

The only valid arguments for this method are codecs that serialize to objects with explicit fields, such as codecs created using RecordCodecBuilder or fieldOf. Codecs that serialize nothing (such as unit codecs) are also valid as they act as objects-with-no-fields.

The resulting Pair codec will serialize a single object that has all of the fields of the two original codecs. For example:

public static final Codec<Pair<Integer,String>> PAIR_CODEC = Codec.pair(
	Codec.INT.fieldOf("value").codec(),
	Codec.STRING.fieldOf("name").codec());

JsonElement encodedPair = PAIR_CODEC.encodeStart(JsonOps.INSTANCE, Pair.of(5, "cheese")).result().get();

This codec serializes the above value to:

{
	"value": 5,
	"name": "cheese"
}

Codecs that serialize to objects with undefined fields such as unboundedMap may cause strange and unpredictable behaviour when used here; these objects should be boxed via fieldOf when used in a pair codec.

Either

The Codec.either(codecA, codecB) static method takes two codecs and generates a Codec<Either<A,B>> from them.

When this codec is used to de/serialize an object, it first attempts to use the first codec; if and only if that conversion fails, it then attempts to use the second codec. If that conversion also fails, then the returned DataResult will contain the error data from the *second* codec's conversion attempt.
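As an illustration of the fallback behaviour, the sketch below builds a codec that accepts either a number or a string. The class and field names are hypothetical, and the example assumes DataFixerUpper and Gson are available.

```java
import com.google.gson.JsonElement;
import com.mojang.datafixers.util.Either;
import com.mojang.serialization.Codec;
import com.mojang.serialization.JsonOps;

// Hypothetical example class; not part of DataFixerUpper or Minecraft
final class EitherCodecExample {

    // Tries Codec.INT first, and falls back to Codec.STRING only if that fails
    static final Codec<Either<Integer, String>> INT_OR_STRING =
        Codec.either(Codec.INT, Codec.STRING);

    // Throws if neither codec could read the json
    static Either<Integer, String> parse(JsonElement json) {
        return INT_OR_STRING.parse(JsonOps.INSTANCE, json).result().orElseThrow();
    }
}
```

Parsing the json number 5 should yield a left value, while parsing the json string "cheese" should yield a right value, since Codec.INT cannot read it.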

Numeric Ranges

The Codec.intRange(min,max), Codec.floatRange(min,max), and Codec.doubleRange(min,max) static methods generate Codecs for Integers, Floats, or Doubles, respectively, for which only a specified inclusive range is valid, and values outside that range will fail to de/serialize.
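For instance, a range codec for a percentage value could be sketched as follows. PERCENT and isValid are names invented for this example.

```java
import com.google.gson.JsonPrimitive;
import com.mojang.serialization.Codec;
import com.mojang.serialization.JsonOps;

// Hypothetical example class; not part of DataFixerUpper or Minecraft
final class RangeCodecExample {

    // Accepts only integers from 0 to 100, both ends included
    static final Codec<Integer> PERCENT = Codec.intRange(0, 100);

    // True if the value parses successfully, false if the codec rejects it
    static boolean isValid(int value) {
        return PERCENT.parse(JsonOps.INSTANCE, new JsonPrimitive(value)).result().isPresent();
    }
}
```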

Maps

Suppose we want to serialize a HashMap or other Map type, where we could have indefinitely many key-value pairs and we don't know what the keys are ahead of time.

We can create a Codec<Map<KEY,VALUE>> using the Codec.unboundedMap static method, which takes a key codec and a value codec and creates a codec for a map type:

public static final Codec<Map<String, BlockPos>> MAP_CODEC = Codec.unboundedMap(Codec.STRING, BlockPos.CODEC);

The serialized form of maps serialized by this codec will be a JsonObject or CompoundTag, whose fields are the key-value pairs in the map; the map's keys will be used as the field names, and the map's values will be the values of those fields.

A limitation of using unboundedMap is that it only supports key codecs that serialize to Strings (including codecs for things like ResourceLocation that aren't Strings themselves but still serialize to strings). To create a codec for a Map whose keys are not fundamentally strings, the Map must be serialized as a list of key-value pairs instead of using unboundedMap.
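One possible way to serialize a map as a list of key-value pairs is sketched below for a Map<Integer, String>. The KeyValue record, the "key" and "value" field names, and the class name are all assumptions made for this example rather than anything defined by DataFixerUpper.

```java
import com.mojang.serialization.Codec;
import com.mojang.serialization.codecs.RecordCodecBuilder;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class; KeyValue and the field names are made up for illustration
final class PairListMapCodecExample {

    record KeyValue(int key, String value) {}

    // Each map entry becomes an object like {"key": 1, "value": "one"}
    static final Codec<KeyValue> ENTRY_CODEC = RecordCodecBuilder.create(instance -> instance.group(
            Codec.INT.fieldOf("key").forGetter(KeyValue::key),
            Codec.STRING.fieldOf("value").forGetter(KeyValue::value)
        ).apply(instance, KeyValue::new));

    // The map is a list of those objects. Collectors.toMap throws on duplicate keys,
    // so real code may want to validate with flatXmap instead of xmap.
    static final Codec<Map<Integer, String>> MAP_CODEC = ENTRY_CODEC.listOf().xmap(
        entries -> entries.stream().collect(Collectors.toMap(KeyValue::key, KeyValue::value)),
        map -> map.entrySet().stream()
            .map(e -> new KeyValue(e.getKey(), e.getValue()))
            .collect(Collectors.toList()));
}
```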

Equivalent Types and xmap

Suppose we have two java classes, Amalgam and Box; any Amalgam instance can be converted to a Box, and vice-versa. Now suppose we have a Codec<Amalgam>, but we'd also like to have a Codec<Box>. Rather than creating an entirely new codec for Box from scratch, we can simply xmap our Amalgam codec instead.

The Codec#xmap instance method is used to generate a second codec for a fundamentally equivalent type to the first codec's type. The method takes two function objects as arguments, which are used to convert the first type to the second when deserializing, and converting the second type to the first when serializing:

public static final Codec<Box> CODEC = Amalgam.CODEC.xmap(Amalgam::toBox, Box::toAmalgam);

Codecs created in this manner will serialize objects in the same format as the starting codec.

Partially Equivalent Types, flatComapMap, comapFlatMap, and flatXmap

Consider the ResourceLocation: Any ResourceLocation can be converted to a String, but not all Strings can be converted to a ResourceLocation; ResourceLocations have strict limits on their format and allowed characters.

While we *could* use xmap to convert the Codec.STRING to a codec for ResourceLocations, this would cause attempts to parse an invalid string like SHOUTY:MOD:Invalid$Characters to throw a runtime exception, when we really should be returning a failed DataResult to the parser instead -- which indeed is what the vanilla ResourceLocation codec does.

Codecs have three additional instance methods for creating equivalent codecs for when we can *potentially* convert one type to another, but are not guaranteed to be able to do so. These take conversion function arguments that return DataResults, allowing validation to be performed during serialization and deserialization.

Codec Conversion Methods

| Can A always be converted to B? | Can B always be converted to A? | Which method of codecA should be used to create codecB? |
|---|---|---|
| yes | yes | codecA.xmap |
| yes | no | codecA.flatComapMap |
| no | yes | codecA.comapFlatMap |
| no | no | codecA.flatXmap |
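As a sketch of the "no, yes" row, the following builds a codec for a validated string type with comapFlatMap. LowercaseName and its validation rule are invented for this example. It also assumes a DataFixerUpper version where DataResult.error accepts a String, as in the Minecraft versions this page describes; newer releases take a Supplier<String> instead.

```java
import com.mojang.serialization.Codec;
import com.mojang.serialization.DataResult;

// Hypothetical example class; LowercaseName is made up for illustration
final class ComapFlatMapExample {

    record LowercaseName(String value) {

        // String -> LowercaseName can fail, so it returns a DataResult instead of throwing
        static DataResult<LowercaseName> read(String raw) {
            boolean valid = !raw.isEmpty() && raw.chars().allMatch(c -> c >= 'a' && c <= 'z');
            if (valid) {
                return DataResult.success(new LowercaseName(raw));
            }
            // Assumes the older DataResult.error(String) signature
            return DataResult.error("Not a lowercase name: " + raw);
        }
    }

    // A (String) cannot always become B (LowercaseName), but B can always become A,
    // so comapFlatMap is the matching method from the table above
    static final Codec<LowercaseName> CODEC =
        Codec.STRING.comapFlatMap(LowercaseName::read, LowercaseName::value);
}
```

Parsing an invalid string with this codec returns a failed DataResult rather than throwing a runtime exception.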

Registry Dispatch

Registry Dispatch Codecs allow us to define a registry of codecs and delegate to a specific codec to deserialize a particular json based on a type field in that json. Dispatch codecs are used extensively when deserializing worldgen data.

To create a dispatch codec for a Thing class, the following steps can be performed:

  1. Create a Thing abstract class and a ThingType interface. The ThingType interface should have a method that supplies a Codec<Thing>, while Thing subclasses must define a method that supplies a ThingType.
  2. Create a map or registry of ThingTypes, and register a ThingType for each sub-codec we want to have.
  3. Create a Codec<ThingType>, or have the ThingType registry implement Codec.
  4. Create our Codec<Thing> master codec by invoking Codec#dispatch on our ThingType codec. This method's arguments are:
    1. A field name for the ID of the sub-codec (the example json below is using "type")
    2. A function to retrieve a ThingType from a Thing
    3. A function to retrieve a Codec<Thing> from a ThingType

We can then use our Codec<Thing> to create Thing fields in other codecs whose serialized format depends on the specific sub-codec used by a Thing instance.

As an example of this, consider the ExampleCodecClass earlier. Suppose we make this class extend Thing and register our codec for it to a codec dispatch registry with the id "ourmod:exampleclass". If we were to define an instance of this class in a Thing field in some json, it would look like:

"some_thing":
{
	"type": "ourmod:exampleclass",
	"some_int": 42,
	"item": "minecraft:gold_ingot",
	"block_positions":
	[
		[0,0,0],
		[10,20,-100]
	]
}

Other ThingTypes we register would have different fields in this json object, but would still be valid for the "some_thing" field.

Several examples of vanilla classes that use dispatch codecs:

  • RuleTest and RuleTestType
  • BlockPlacer and BlockPlacerType
  • ConfiguredDecorator and FeatureDecorator

Registering MapCodecCodecs for Dispatch Subcodecs

When registering a subcodec to any dispatch codec registry, the registered subcodec should be an instance of MapCodecCodec, or the subcodec will be nested in its own object when serialized.

For example, suppose we register a single-field subcodec, where we use fieldOf-and-xmap to convert an int to our int-holding Thing:

public record Thing(int n){}
public static Codec<Thing> CODEC = Codec.INT.fieldOf("n").codec().xmap(Thing::new, Thing::n);

This results in this json when serialized:

"some_thing":
{
	"type": "ourmod:thing",
	"value": {
		"n": 5
	}
}

This occurs because xmap does not produce a MapCodecCodec, and if this nested object is not desired, then our registered subcodec must be a MapCodecCodec.

However, fieldOf() produces a MapCodec, which has a codec() method, which does produce a MapCodecCodec. We can rearrange our codec builder, which then produces a cleaner json:

public static Codec<Thing> CODEC = Codec.INT // Primitive codec
	.fieldOf("n") // MapCodec
	.xmap(Thing::new, Thing::n) // MapCodec
	.codec(); // MapCodecCodec! That's what we want.

"some_thing":
{
	"type": "ourmod:thing",
	"n": 5
}

RecordCodecBuilder also produces MapCodecCodecs.

External Links