Changes

7,926 bytes added ,  23:06, 4 January 2021
Expand and rewrite, describe codec utilities in greater detail
Line 1: Line 1: −
Codecs are an abstraction layer around [[DynamicOps|DynamicOps]] which allow objects to be serialized and deserialized in different contexts such as JSON or NBT. It creates an easy way to read and interpret objects without the need of manual labor.
+
Codecs are a serialization tool from Mojang's DataFixerUpper library. Codecs are used alongside [[DynamicOps|DynamicOps]] to allow objects to be serialized to different formats, such as JSON or NBT, and back. While the DynamicOps describes the format the object is to be serialized to, the Codec describes the manner in which the object is to be serialized; a single Codec can be used to serialize an object to any format for which a DynamicOps exists. Codecs and DynamicOps jointly form an abstraction layer over data serialization, simplifying the effort needed to serialize or deserialize data.
==Codec Serialization and Deserialization==
+
 
Codec Serialization and Deserialization is handled through two main methods: <code>Codec#encodeStart</code> and <code>Codec#parse</code> respectively. Each of these methods returns a <code>DataResult</code> which holds the encoded object type or the decoded object class. An <code>Optional</code> of the resulting output can be grabbed via <code>DataResult#result</code>. If a custom message should be specified along with the error message, that can be specified using <code>DataResult#resultOrPartial</code>. Alternatively, <code>DataResult#getOrThrow</code> can be used, which grabs the instance directly instead of an optional.
+
= Using Codecs =
==Creating a Codec for a Class==
+
 
Let's assume there is the following class structure that a codec should be created for:
+
== Serialization and Deserialization ==
 +
 
 +
The primary use for Codecs is to serialize Java objects to some serialized type, such as a JsonElement or an INBT, and to deserialize a serialized object back to its proper Java type. This is accomplished with <code>Codec#encodeStart</code> and <code>Codec#parse</code>, respectively. Given a <code>Codec<SomeJavaType></code> and a <code>DynamicOps<SomeSerializedType></code>, we can convert instances of SomeJavaType to instances of SomeSerializedType and back.
 +
 
 +
Each of these methods takes a [[DynamicOps]] instance and an instance of the object we are serializing or deserializing, and returns a DataResult:
 +
 
 +
<syntaxhighlight lang="java">
 +
// let someCodec be a Codec<SomeJavaType>
// let someJavaObject be an instance of SomeJavaType
// let someNBT and someJsonElement be instances of INBT and JsonElement, respectively

// serialize some java object to INBT
DataResult<INBT> encodedNBT = someCodec.encodeStart(NBTDynamicOps.INSTANCE, someJavaObject);

// deserialize some INBT instance back to a proper java object
DataResult<SomeJavaType> decodedFromNBT = someCodec.parse(NBTDynamicOps.INSTANCE, someNBT);

// serialize some java object to a JsonElement
DataResult<JsonElement> encodedJson = someCodec.encodeStart(JsonOps.INSTANCE, someJavaObject);

// deserialize a JsonElement back to a proper java object
DataResult<SomeJavaType> decodedFromJson = someCodec.parse(JsonOps.INSTANCE, someJsonElement);
 +
</syntaxhighlight>
 +
 
 +
A DataResult holds either the converted instance or some error data, depending on whether the conversion succeeded. There are several things we can do with this DataResult: <code>DataResult#result</code> simply returns an Optional containing the converted object if the conversion was successful, while <code>DataResult#resultOrPartial</code> also runs a given function if the conversion was unsuccessful (in addition to returning the Optional). <code>DataResult#resultOrPartial</code> is particularly useful for logging errors during datapack deserialization:
 +
 
 +
<syntaxhighlight lang="java">
 +
// deserialize something from json
someCodec.parse(JsonOps.INSTANCE, someJsonElement)
    .resultOrPartial(errorMessage -> doSomethingIfBadData(errorMessage))
    .ifPresent(someJavaObject -> doSomethingIfGoodData(someJavaObject));
 +
</syntaxhighlight>
 +
 
 +
== Builtin Codecs ==
 +
=== Primitive Codecs ===
 +
The Codec class itself contains static instances of codecs for all supported primitive types, e.g. <code>Codec.STRING</code> is the canonical <code>Codec<String></code> implementation. Primitive codecs include:
 +
* BOOL, which serializes to a boolean value
 +
* BYTE, SHORT, INT, LONG, FLOAT, and DOUBLE, which serialize to numeric values
 +
* STRING, which serializes to a string
 +
* BYTE_BUFFER, INT_STREAM, and LONG_STREAM, which serialize to lists of numbers
 +
* EMPTY, which represents null objects
 +
 
 +
=== Other Builtin Codecs ===
 +
Vanilla Minecraft has many builtin codecs for objects that it frequently serializes. These codecs are typically static instances in the class the codec is serializing; e.g. <code>ResourceLocation.CODEC</code> is the canonical <code>Codec<ResourceLocation></code>, while <code>BlockPos.CODEC</code> is the codec used for serializing a BlockPos.
 +
 
 +
Each vanilla <code>Registry</code> acts as the Codec for the type of object the registry contains; e.g. <code>Registry.BLOCK</code> is itself a <code>Codec<Block></code>. Forge Registries, however, do not currently implement Codec and cannot yet be used in this way; custom codecs must be created for Forge-specific registries that are not tied to specific vanilla registries.
 +
 
 +
Of particular note here is the CompoundNBT.CODEC, which can be used to e.g. serialize a CompoundNBT into a json file. This has a notable limitation in that CompoundNBT.CODEC *cannot* safely deserialize lists of numbers from json, due to the strong typing of ListNBT and the way that the NBTDynamicOps deserializer reads numeric values.
 +
 
 +
= Creating Codecs =
 +
 
 +
Suppose we have the following class, and we want to deserialize json files to instances of this class:
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
 
public class ExampleCodecClass {
 
public class ExampleCodecClass {
   −
     private final int field_1;
+
     private final int someInt;
     private final List<BlockPos> field_2;
+
     private final Item item;
     private final Item field_3;
+
     private final List<BlockPos> blockPositions;
   −
     public ExampleCodecClass(int field_1, List<BlockPos> field_2, Item field_3) {...}
+
     public ExampleCodecClass(int someInt, Item item, List<BlockPos> blockPositions) {...}
 +
 
 +
    public int getSomeInt() { return this.someInt; }
 +
    public Item getItem() { return this.item; }
 +
    public List<BlockPos> getBlockPositions() { return this.blockPositions; }
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
For each basic object instance, a codec can be constructed using <code>RecordCodecBuilder::create</code>. This takes in a function that converts an <code>Instance</code> of an object, which is a group of codecs for each serializable field, to an <code>App</code>, which is a unary type constructor for allowing algorithms to be generalized using generics.
+
 
 +
A json file for an instance of this class might look like:
 +
 
 +
<syntaxhighlight lang="json">
 +
{
  "some_int": 42,
  "item": "minecraft:gold_ingot",
  "block_positions": [
    [0, 0, 0],
    [10, 20, -100]
  ]
}
 +
</syntaxhighlight>
 +
 
 +
We can assemble a codec for this class by building a new codec out of smaller codecs. We'll need a codec for each of these fields:
 +
* a <code>Codec<Integer></code>
 +
* a <code>Codec<Item></code>
 +
* a <code>Codec<List<BlockPos>></code>
 +
And then we'll need to assemble these into a <code>Codec<ExampleCodecClass></code>.
 +
 
 +
As previously mentioned, we can use <code>Codec.INT</code> for the integer codec, and <code>Registry.ITEM</code> for the Item codec. We don't have a builtin codec for list-of-blockpos, but we can use BlockPos.CODEC to create one.
 +
 
 +
== Lists ==
 +
The <code>Codec#listOf</code> instance method can be used to generate a codec for a List from an existing codec:
 +
 
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
// BlockPos.CODEC is a Codec<BlockPos>
    return ...;
+
Codec<List<BlockPos>> blockPosListCodec = BlockPos.CODEC.listOf();
});
   
</syntaxhighlight>
 
</syntaxhighlight>
To add a list of valid codecs, which is denoted by <code>Pn</code>, where n is the number of fields in the instance, <code>Instance#group</code> is used, which takes in codecs converted into an <code>App</code> of some kind. This example will examine three such scenarios.
     −
First, there is a primitive integer field. All primitive codecs are declared within the <code>Codec</code> class along with a few extra primitive streams (in this case we will use <code>Codec#INT</code>). To convert this codec into a valid key-pair form, the parameter name needs to be specified. This can be done using <code>Codec#fieldOf</code> which will take in a string which represents the key of this field. This will convert the codec into a MapCodec which as the name states creates a key-value pair to deserialize the instance from. From there, how to serialize the instance from the class object must also be specified. This can be done using <code>MapCodec#forGetter</code> which takes in a function that converts the class object to the type instance, hence the getter method name. This creates a <code>RecordCodecBuilder</code> which will be the final state of the codec as it is an instance of <code>App</code>.
+
Codecs created via listOf() serialize things to listlike objects, such as [] json arrays or ListNBTs.
 +
 
 +
== Records ==
 +
RecordCodecBuilder is used to generate codecs that serialize instances of classes with explicitly named fields, like our example above. Codecs created via RecordCodecBuilder serialize things to maplike objects, such as {} json objects or CompoundNBTs.
 +
 
 +
RecordCodecBuilder can be used in several ways, but the simplest form is as follows:
 +
 
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public static final Codec<SomeJavaClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
someFieldCodecA.fieldOf("field_name_a").forGetter(SomeJavaClass::getFieldA),
      ...)...;
+
someFieldCodecB.fieldOf("field_name_b").forGetter(SomeJavaClass::getFieldB),
});
+
someFieldCodecC.fieldOf("field_name_c").forGetter(SomeJavaClass::getFieldC)
// up to 16 fields can be declared here
).apply(instance, SomeJavaClass::new));
 
</syntaxhighlight>
 
</syntaxhighlight>
Next, there is a list of <code>BlockPos</code>, which has a premade codec within itself. However, a <code>Codec<BlockPos></code> needs to be converted into a <code>Codec<List<BlockPos>></code>. Luckily, there are a few helpers within the codec class that make some of these conversions trivial. In this case, <code>Codec#listOf</code> will convert a codec of some generic into a codec for a list of that generic. The process for attaching the codec is exactly the same.
+
 
 +
Each line in the group specifies a codec instance for the type of that field, the field name in the serialized object, and the corresponding getter function in the java class. The builder is concluded by specifying a constructor or factory for the java class whose arguments are the previously defined fields in the same order.
 +
 
 +
For example, using RecordCodecBuilder to create a codec for our example class above:
 
<syntaxhighlight lang="java">
 
<syntaxhighlight lang="java">
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(instance -> instance.group(
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
Codec.INT.fieldOf("some_int").forGetter(ExampleCodecClass::getSomeInt),
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
+
Registry.ITEM.fieldOf("item").forGetter(ExampleCodecClass::getItem),
      ...)...;
+
BlockPos.CODEC.listOf().fieldOf("block_positions").forGetter(ExampleCodecClass::getBlockPositions)
});
+
).apply(instance, ExampleCodecClass::new));
 
</syntaxhighlight>
 
</syntaxhighlight>
A few other notable mentions that might be used within a codec:
+
 
{| class="wikitable" style="margin-left: auto; margin-right: auto; width: 1415px;" data-mce-style="margin-left: auto; margin-right: auto; width: 1415px;"
+
===Optional and Default Values in Record Fields===
|-
+
When RecordCodecBuilder is used as shown above, all of the fields are *required* to be in the serialized object (the JsonObject/CompoundNBT/etc), or the entire thing will fail to parse when the codec tries to deserialize it. If we wish to have optional or default values, there are several alternatives to fieldOf() we can use.
| style="width: 52px;" data-mce-style="width: 52px;"|Method
+
 
| style="width: 640px;" data-mce-style="width: 640px;"|Description
+
* <code>someCodec.optionalFieldOf("field_name")</code> creates a field for an Optional. If the field in the json/nbt is not present or invalid, it will deserialize as an empty optional. Empty optionals will not be serialized; the field will be omitted from the json or nbt.
|-
+
* <code>someCodec.optionalFieldOf("field_name", someDefaultValue)</code> creates an optional field that deserializes as the given default value if the field is not present in the json/nbt. When serializing, if the field in the java object equals the default value, the value will not be serialized and the field will be omitted from the json or nbt.
| style="width: 52px;" data-mce-style="width: 52px;"|intRange
+
 
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a integer codec with a valid inclusive range.
+
When using optional fields, be wary that if the field contains bad data or otherwise fails to deserialize, the error will be silently caught, and the field will deserialize as the default value instead!
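The pitfall can be pictured with a small plain-Java model. This is only a sketch of the behaviour described above, not DataFixerUpper code: the <code>OptionalFieldModel</code> class, the <code>readOptionalInt</code> helper, and the use of a <code>Map</code> as the serialized object are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model only: a Map stands in for the serialized json/nbt object.
final class OptionalFieldModel {
    // Hypothetical helper mimicking an optional int field with a default value.
    static int readOptionalInt(Map<String, Object> serialized, String key, int defaultValue) {
        Object raw = serialized.get(key);
        if (raw instanceof Integer) {
            return (Integer) raw;
        }
        // A missing field and a malformed field both end up here, with no error reported.
        return defaultValue;
    }

    public static void main(String[] args) {
        Map<String, Object> good = new HashMap<>();
        good.put("some_int", 42);
        Map<String, Object> bad = new HashMap<>();
        bad.put("some_int", "forty-two"); // wrong type: a string instead of a number

        System.out.println(readOptionalInt(good, "some_int", 0)); // prints 42
        System.out.println(readOptionalInt(bad, "some_int", 0));  // prints 0
    }
}
```

In the second call the caller cannot tell a typo in the data file apart from a deliberately omitted field, which is why a required field is the safer choice when bad data should be reported.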
 +
 
 +
==Unit==
 +
The <code>Codec.unit(defaultValue)</code> static method creates a Codec that always deserializes a specified default value, regardless of input. When serializing, it serializes nothing.
 +
 
 +
==Pair==
 +
The <code>Codec.pair(codecA, codecB)</code> static method takes two codecs and generates a Codec<Pair<A,B>> from them.
 +
 
 +
Using this function requires that codecA be a codec that serializes to a string.
 +
 
 +
==Either==
 +
The <code>Codec.either(codecA, codecB)</code> static method takes two codecs and generates a Codec<Either<A,B>> from them.
 +
 
 +
When this codec is used to de/serialize an object, it first attempts to use the first codec; if and only if that conversion fails, it then attempts to use the second codec. If that conversion also fails, then the returned DataResult will contain the error data from the *second* codec's conversion attempt.
 +
 
 +
==Numeric Ranges==
 +
The <code>intRange(min,max)</code>, <code>floatRange(min,max)</code>, and <code>doubleRange(min,max)</code> static methods in Codec generate Codecs for Integers, Floats, or Doubles, respectively, for which only a specified inclusive range is valid, and values outside that range will fail to de/serialize.
 +
 
 +
==Maps==
 +
Suppose we want to serialize a HashMap or other Map type, where we could have indefinitely many key-value pairs and we don't know what the keys are ahead of time.
 +
 
 +
We can create a <code>Codec<Map<KEY,VALUE>></code> using the <code>Codec.unboundedMap</code> static method, which takes a key codec and a value codec and creates a codec for a map type:
 +
 
 +
<syntaxhighlight lang="java">
 +
public static final Codec<Map<String, BlockPos>> MAP_CODEC = Codec.unboundedMap(Codec.STRING, BlockPos.CODEC);
 +
</syntaxhighlight>
 +
 
 +
The serialized form of maps serialized by this codec will be a JsonObject or CompoundNBT, whose fields are the key-value pairs in the map.
 +
 
 +
A limitation of using unboundedMap is that it only supports key codecs that serialize to Strings (including codecs for things like ResourceLocation that aren't Strings themselves but still serialize to strings). To create a codec for a Map whose keys are not fundamentally strings, the Map must be serialized as a list of key-value entries instead of using unboundedMap. As Codec.pair() also requires that the first codec serialize to a string, the pair function cannot be used for this purpose either.
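One way to build such a codec is to write a codec for a single key-value entry, call <code>listOf()</code> on it, and then <code>xmap</code> the resulting list codec into a map codec. As a sketch, the two conversion functions handed to <code>xmap</code> could look like the following plain Java. The class and method names are hypothetical, and no DataFixerUpper types are involved.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; no DataFixerUpper types are involved.
final class MapEntryConversions {
    // Map -> list of entries: the direction used when serializing.
    static <K, V> List<Map.Entry<K, V>> toEntryList(Map<K, V> map) {
        List<Map.Entry<K, V>> entries = new ArrayList<>();
        for (Map.Entry<K, V> entry : map.entrySet()) {
            entries.add(new SimpleImmutableEntry<>(entry.getKey(), entry.getValue()));
        }
        return entries;
    }

    // List of entries -> Map: the direction used when deserializing.
    // If a key appears more than once, the last entry wins.
    static <K, V> Map<K, V> toMap(List<Map.Entry<K, V>> entries) {
        Map<K, V> map = new LinkedHashMap<>();
        for (Map.Entry<K, V> entry : entries) {
            map.put(entry.getKey(), entry.getValue());
        }
        return map;
    }
}
```

With this shape, the serialized form is a list of small objects (each holding one key and one value) rather than a single object whose field names are the keys, so the key codec no longer has to serialize to a string.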
 +
 
 +
==Equivalent Types and xmap==
 +
Suppose we have two java classes, Amalgam and Box; any Amalgam instance can be converted to a Box, and vice-versa. Now suppose we have a Codec<Amalgam>, but we'd also like to have a Codec<Box>. Rather than creating an entirely new codec for Box from scratch, we can simply xmap our Amalgam codec instead.
 +
 
 +
The <code>Codec#xmap</code> instance method is used to generate a second codec for a type that is fundamentally equivalent to the first codec's type. The method takes two function objects as arguments: the first converts the first type to the second when deserializing, and the second converts the second type back to the first when serializing:
 +
 
 +
<syntaxhighlight lang="java">
 +
public static final Codec<Box> BOX_CODEC = Amalgam.CODEC.xmap(Amalgam::toBox, Box::toAmalgam);
 +
</syntaxhighlight>
 +
 
 +
Codecs created in this manner will serialize objects in the same format as the starting codec.
 +
 
 +
==Partially Equivalent Types, flatComapMap, comapFlatMap, and flatXmap==
 +
Consider the ResourceLocation: Any ResourceLocation can be converted to a String, but not all Strings can be converted to a ResourceLocation; ResourceLocations have strict limits on their format and allowed characters.
 +
 
 +
While we *could* use xmap to convert the Codec.STRING to a codec for ResourceLocations, this would cause attempts to parse an invalid string like <code>SHOUTY:MOD:Invalid$Characters</code> to throw a runtime exception, when we really should be returning a failed DataResult to the parser instead -- which indeed is what the vanilla ResourceLocation codec does.
 +
 
 +
Codecs have three additional instance methods for creating equivalent codecs for when we can *potentially* convert one type to another, but are not guaranteed to be able to do so. These take conversion function arguments that return DataResults, allowing validation to be performed during serialization and deserialization.
 +
 
 +
{| class="wikitable"
 +
|+ Codec Conversion Methods
 
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|floatRange
+
! Can A always be converted to B? !! Can B always be converted to A? !! Which method of codecA should be used to create codecB?
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a float codec with a valid inclusive range.
   
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|doubleRange
+
| yes || yes || codecA.xmap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a double codec with a valid inclusive range.
   
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|pair
+
| yes || no || codecA.flatComapMap
| style="width: 640px;" data-mce-style="width: 640px;"|Create a pair using two codecs.
   
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|either
+
| no || yes || codecA.comapFlatMap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates an either (an object with some fallback object) using two codecs.
   
|-
 
|-
| style="width: 52px;" data-mce-style="width: 52px;"|unboundedMap
+
| no || no || codecA.flatXmap
| style="width: 640px;" data-mce-style="width: 640px;"|Creates a map using two codecs.
   
|}
 
|}
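The "no / yes" row of the table above (a String cannot always become an id, but an id can always become a String) can be modelled in plain Java. This is a simplified sketch, not the DataFixerUpper API: <code>Result</code> is a hypothetical stand-in for DataResult, <code>ValidatedStringCodec</code> and <code>NamespacedId</code> are invented for illustration, and the character rules are an approximation of ResourceLocation's.

```java
import java.util.function.Function;

// Hypothetical stand-in for DataResult: holds either a value or an error message.
final class Result<T> {
    final T value;      // null on failure
    final String error; // null on success

    private Result(T value, String error) {
        this.value = value;
        this.error = error;
    }

    static <T> Result<T> success(T value) { return new Result<>(value, null); }
    static <T> Result<T> failure(String error) { return new Result<>(null, error); }
    boolean isSuccess() { return error == null; }
}

// Hypothetical stand-in for a codec derived from a string codec:
// String -> B may fail, B -> String cannot.
final class ValidatedStringCodec<B> {
    private final Function<String, Result<B>> to;
    private final Function<B, String> from;

    ValidatedStringCodec(Function<String, Result<B>> to, Function<B, String> from) {
        this.to = to;
        this.from = from;
    }

    Result<B> parse(String raw) { return to.apply(raw); }
    String encode(B value) { return from.apply(value); }
}

// Simplified id type; the allowed characters are an approximation for illustration.
final class NamespacedId {
    final String namespace;
    final String path;

    private NamespacedId(String namespace, String path) {
        this.namespace = namespace;
        this.path = path;
    }

    // Validation reports a failed Result instead of throwing an exception.
    static Result<NamespacedId> read(String raw) {
        String[] parts = raw.split(":", -1);
        if (parts.length != 2) {
            return Result.failure("Expected namespace:path, got: " + raw);
        }
        if (!parts[0].matches("[a-z0-9_.-]+") || !parts[1].matches("[a-z0-9_./-]+")) {
            return Result.failure("Invalid characters in: " + raw);
        }
        return Result.success(new NamespacedId(parts[0], parts[1]));
    }

    @Override
    public String toString() { return namespace + ":" + path; }
}
```

Constructing <code>new ValidatedStringCodec<>(NamespacedId::read, NamespacedId::toString)</code> and parsing <code>SHOUTY:MOD:Invalid$Characters</code> yields a failed result carrying an error message, while <code>minecraft:gold_ingot</code> parses successfully and encodes back to the same string.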
Last, there is an Item. This is optional and should default to <code>Items#AIR</code> when not defined. Here, there will be two techniques used to grab the associated codec. By default, a <code>Registry</code> is an instance of a codec. Therefore, the codec can be grabbed by specifying the registry instance (e.g. <code>Registry#ITEM</code>). What if there is no vanilla registry instance, however? Then, another codec method can be used: <code>Codec#xmap</code>. This allows the associated object to be mapped to another object. A function specifies mapping the associated object to the new object for decoding and vice versa for encoding. For example, the <code>ResourceLocation</code> codec can be mapped to an <code>Item</code> through the forge registry instance.
     −
To define a field as optional, <code>Codec#optionalFieldOf</code> should be used. One instance holds the value as an <code>Optional</code> while the other allows a defined default value.
+
== Registry Dispatch ==
<syntaxhighlight lang="java">
+
Registry Dispatch Codecs allow us to define a registry of codecs and delegate to a specific codec to deserialize a particular json based on a type field in that json. Dispatch codecs are used extensively when deserializing worldgen data.
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
+
 
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
+
To create a dispatch codec for a Thing class, the following steps can be performed:
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
+
# Create an abstract Thing class and a ThingType interface. The ThingType interface should have a method that supplies a Codec<Thing>, while Thing subclasses must define a method that supplies a ThingType.
      ResourceLocation.CODEC.xmap(loc -> ForgeRegistries.ITEMS.getValue(loc), item -> item.getRegistryName()).optionalFieldOf("field_3", Items.AIR).forGetter(obj -> obj.field_3))...;
+
# Create a map or registry of ThingTypes, and register a ThingType for each sub-codec we want to have.
});
+
# Create a Codec<ThingType>, or have the ThingType registry implement Codec.
 +
# Create our Codec<Thing> master codec by invoking <code>Codec#dispatch</code> on our ThingType codec. This method's arguments are:
 +
## A field name for the ID of the sub-codec (the example json below is using "type")
 +
## A function to retrieve a ThingType from a Thing
 +
## A function to retrieve a Codec<Thing> from a ThingType
 +
 
 +
We can then use our Codec<Thing> to create Thing fields in other codecs whose serialized format depends on the specific sub-codec used by a Thing instance.
 +
 
 +
As an example of this, consider the ExampleCodecClass earlier. Suppose we make this class extend Thing and register our codec for it to a codec dispatch registry with the id "ourmod:exampleclass". If we were to define an instance of this class in a Thing field in some json, it would look like:
 +
 
 +
<syntaxhighlight lang="json">
 +
"some_thing":
 +
{
  "type": "ourmod:exampleclass",
  "some_int": 42,
  "item": "minecraft:gold_ingot",
  "block_positions": [
    [0, 0, 0],
    [10, 20, -100]
  ]
}
 
</syntaxhighlight>
 
</syntaxhighlight>
Now, there is a product <code>P3</code>, as there are three parameters. To convert this into an <code>App</code>, the method <code>P#apply</code> should be called. This takes in an <code>Applicative</code>, which our builder is an instance of, and a function that returns the class object given the specified wrapped arguments. Creating a new constructor is one way of creating the outputted codec for the class object.
  −
<syntaxhighlight lang="java">
  −
public static final Codec<ExampleCodecClass> CODEC = RecordCodecBuilder.create(builder -> {
  −
    return builder.group(Codec.INT.fieldOf("field_1").forGetter(obj -> obj.field_1),
  −
      BlockPos.CODEC.listOf().fieldOf("field_2").forGetter(obj -> obj.field_2),
  −
      ResourceLocation.CODEC.xmap(loc -> ForgeRegistries.ITEMS.getValue(loc), item -> item.getRegistryName()).optionalFieldOf("field_3", Items.AIR).forGetter(obj -> obj.field_3))
  −
      .apply(builder, ExampleCodecClass::new);
  −
});
  −
</syntaxhighlight>
  −
==Limitations==
  −
Codecs are required to abide by a String-Object key-pair. Any codec that does not have a String key will throw an error during encoding and decoding.
     −
A group can have at most 16 inner codecs normally. This limitation is specified by the number of product generic classes available.
+
Other ThingTypes we register would have different fields in this json object, but would still be valid for the "some_thing" field.
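The mechanism behind the "type" field can be modelled in plain Java. The following is a simplified sketch, not the <code>Codec#dispatch</code> implementation: a <code>Map</code> stands in for the json object, and <code>ThingDispatcher</code>, <code>ExampleThing</code>, <code>NamedThing</code>, and the "ourmod:named" id are hypothetical names invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Simplified model: a Map stands in for the serialized json object.
abstract class Thing {
    // Plays the role of "retrieve a ThingType from a Thing".
    abstract String typeId();
}

final class ExampleThing extends Thing {
    final int someInt;
    ExampleThing(int someInt) { this.someInt = someInt; }
    @Override String typeId() { return "ourmod:exampleclass"; }
}

// Hypothetical second type with a different set of fields.
final class NamedThing extends Thing {
    final String name;
    NamedThing(String name) { this.name = name; }
    @Override String typeId() { return "ourmod:named"; }
}

final class ThingDispatcher {
    // Registry of sub-decoders, keyed by the value of the "type" field.
    private final Map<String, Function<Map<String, Object>, Thing>> decoders = new HashMap<>();

    void register(String typeId, Function<Map<String, Object>, Thing> decoder) {
        decoders.put(typeId, decoder);
    }

    // Reads "type" first, then hands the whole object to the matching decoder.
    // An unknown or missing type yields an empty Optional.
    Optional<Thing> decode(Map<String, Object> serialized) {
        Function<Map<String, Object>, Thing> decoder = decoders.get(serialized.get("type"));
        if (decoder == null) {
            return Optional.empty();
        }
        return Optional.of(decoder.apply(serialized));
    }
}
```

After registering <code>data -> new ExampleThing((Integer) data.get("some_int"))</code> under "ourmod:exampleclass", decoding a map with that type and <code>some_int</code> set to 42 produces an ExampleThing, while a map with an unregistered type produces an empty Optional.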
 +
 
 +
Several examples of vanilla classes that use dispatch codecs:
 +
* RuleTest and IRuleTestType
 +
* IParticleData and ParticleType
 +
* ConfiguredPlacement and Placement
 +
 
 +
=External Links=
 +
* [https://github.com/Mojang/DataFixerUpper/blob/master/src/main/java/com/mojang/serialization/Codec.java Codecs in Mojang's official public DataFixerUpper repository]
 +
* [https://kvverti.github.io/Documented-DataFixerUpper/snapshot/com/mojang/serialization/Codec.html#flatXmap-java.util.function.Function-java.util.function.Function- Unofficial Codec Javadocs]