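Udon Script Analysis
I wanted to understand the script logic of some VRChat worlds, but the scripts in VRChat worlds are all compiled into mysterious MonoBehaviours that AssetRipper cannot readily parse.
I don't know much about VRChat's world-creation ecosystem, but a friend told me these are Udon Scripts, and also told me that he couldn't make sense of the compiled output either.
In that case: Challenge Accepted!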
Udon Program
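The compiled output of an Udon Script.
Assets
After unpacking a world with AssetRipper, you get a large number of MonoBehaviour asset files that AssetRipper cannot parse correctly. Some of these MonoBehaviour assets contain a very long serializedProgramCompressedBytes field, which indicates that the asset is the compiled output of an Udon Script.
serializedProgramCompressedBytes is a hexadecimal string holding the GZip-compressed serialized Udon Program.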
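As a minimal sketch of that first step, the hex string can be turned back into the serialized program with the Python standard library alone. The function name `decode_serialized_program` is my own, and the demo payload is made up so that the snippet runs without a real asset.

```python
import gzip


def decode_serialized_program(hex_text: str) -> bytes:
    """Turn the hex string from the asset into the raw serialized program."""
    compressed = bytes.fromhex(hex_text.strip())
    return gzip.decompress(compressed)


# Round-trip demo with made-up payload bytes (not a real Udon Program).
demo_payload = b"not a real program"
demo_hex = gzip.compress(demo_payload).hex()
assert decode_serialized_program(demo_hex) == demo_payload
```

The returned bytes are what the deserializer expects as input.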
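Deserializing an Udon Program
GZip-decompressing serializedProgramCompressedBytes yields a binary blob, which is the serialized form of an UdonProgram instance.
The serialization uses a VRChat-modified OdinSerializer, so we can deserialize it directly with the matching deserializer. The key code is as follows: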
using System.IO;
using VRC.Udon.Common;
using VRC.Udon.Serialization.OdinSerializer;

// fileData: the bytes obtained by GZip-decompressing serializedProgramCompressedBytes
using var memoryStream = new MemoryStream(fileData);
var context = new DeserializationContext();
var reader = new BinaryDataReader(memoryStream, context);
UdonProgram program =
    VRC.Udon.Serialization.OdinSerializer.SerializationUtility
        .DeserializeValue<UdonProgram>(reader);
The UdonProgram class
The UdonProgram class contains almost everything we need. Below is a simplified¹ class definition:
public class UdonProgram : IUdonProgram
{
    public string InstructionSetIdentifier { get; }
    public int InstructionSetVersion { get; }
    public byte[] ByteCode { get; }
    public IUdonHeap Heap { get; }
    public IUdonSymbolTable EntryPoints { get; }
    public IUdonSymbolTable SymbolTable { get; }
    public IUdonSyncMetadataTable SyncMetadataTable { get; }
    public int UpdateOrder { get; }
}
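We care most about the ByteCode, Heap, EntryPoints, and SymbolTable fields.
Udon bytecode and instruction set
The bytecode is a sequence of instructions made up of big-endian u32 values.
Each instruction has the form OPCODE[OPERAND]. Each part is 4 bytes, and OPERAND is a big-endian u32.
The opcodes comprise NOP, POP, and COPY, which take no operand, and PUSH, JUMP_IF_FALSE, JUMP, EXTERN, ANNOTATION, and JUMP_INDIRECT, which take one operand.
The opcode values are: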
from enum import IntEnum

class OpCode(IntEnum):
    NOP = 0
    PUSH = 1
    POP = 2
    JUMP_IF_FALSE = 4
    JUMP = 5
    EXTERN = 6
    ANNOTATION = 7
    JUMP_INDIRECT = 8
    COPY = 9
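The meaning of each OPCODE and its OPERAND is as follows:
- NOP: no operation
- PUSH I: push the immediate I onto the stack
- POP: pop a value from the stack and discard it
- COPY: copy a value within the heap
- JUMP_IF_FALSE ADDR: conditionally jump to ADDR
- JUMP ADDR: unconditionally jump to ADDR
- EXTERN F: call an external function; F is the heap address of either a function-signature string or an UdonExternDelegate function delegate
- ANNOTATION: an annotation, skipped during execution
- JUMP_INDIRECT IADDR: jump indirectly to the value that IADDR, taken as a heap address, points to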
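The encoding described above is enough to write a small disassembler. The following is a sketch rather than the tool I actually used: `disassemble` and `NO_OPERAND` are names I made up, and the demo bytes are hand-assembled rather than taken from a real world.

```python
import struct
from enum import IntEnum


class OpCode(IntEnum):
    NOP = 0
    PUSH = 1
    POP = 2
    JUMP_IF_FALSE = 4
    JUMP = 5
    EXTERN = 6
    ANNOTATION = 7
    JUMP_INDIRECT = 8
    COPY = 9


# Opcodes that carry no operand and therefore occupy 4 bytes instead of 8.
NO_OPERAND = {OpCode.NOP, OpCode.POP, OpCode.COPY}


def disassemble(bytecode: bytes):
    """Return a list of (byte_offset, opcode, operand_or_None) tuples."""
    instructions = []
    pc = 0
    while pc < len(bytecode):
        (raw_opcode,) = struct.unpack_from(">I", bytecode, pc)
        opcode = OpCode(raw_opcode)  # raises ValueError on unknown values
        if opcode in NO_OPERAND:
            instructions.append((pc, opcode, None))
            pc += 4
        else:
            (operand,) = struct.unpack_from(">I", bytecode, pc + 4)
            instructions.append((pc, opcode, operand))
            pc += 8
    return instructions


# Hand-assembled demo: PUSH 3; POP; JUMP 0
demo = struct.pack(">IIIII", 1, 3, 2, 5, 0)
for offset, opcode, operand in disassemble(demo):
    suffix = "" if operand is None else f" {operand:#x}"
    print(f"{offset:#06x}  {opcode.name}{suffix}")
```

Offsets are reported in bytes because jump targets and entry point addresses are byte positions in ByteCode.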
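Heap
The heap stores the initial heap values used when the Udon VM executes this Udon Program, so it is effectively the constant segment.
A simplified class definition: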
[Serializable]
public sealed class UdonHeap : IUdonHeap, ISerializable
{
    [NonSerialized]
    private readonly IStrongBox[] _heap;
    [NonSerialized]
    private readonly Dictionary<Type, Type>
        _strongBoxOfTypeCache = new Dictionary<Type, Type>();
    [NonSerialized]
    private readonly Dictionary<Type, Type>
        _strongBoxOfTContainedTypeCache = new Dictionary<Type, Type>();

    public void GetObjectData(
        SerializationInfo info, StreamingContext context
    )
    {
        List<ValueTuple<uint, IStrongBox, Type>> list =
            new List<ValueTuple<uint, IStrongBox, Type>>();
        this.DumpHeapObjects(list);
        info.AddValue("HeapCapacity", Math.Max(0, this._heap.Length));
        info.AddValue("HeapDump", list);
    }

    public void DumpHeapObjects(
        List<ValueTuple<uint, IStrongBox, Type>> destination
    )
    {
        uint num = 0;
        while (num < this._heap.Length)
        {
            IStrongBox strongBox = this._heap[num];
            if (strongBox != null)
            {
                destination.Add(new ValueTuple<uint, IStrongBox, Type>(
                    num,
                    strongBox,
                    strongBox.GetType().GenericTypeArguments[0]
                ));
            }
            num += 1;
        }
    }
}
What we are interested in is the HeapDump, which is a list of (Addr, Value, Type) triples.
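Once those triples have been exported from the deserialized object, rebuilding an address-indexed heap is straightforward. The sketch below assumes the dump has already been converted into plain Python tuples with the boxed value unwrapped and the type given as a string. That export format is my own assumption, not something the serializer produces, and the demo entries are made up.

```python
def build_heap(heap_capacity, heap_dump):
    """Rebuild a sparse heap: a list indexed by address, None where unset.

    heap_dump is an iterable of (address, value, type_name) triples.
    Returns the heap plus a dict mapping each address to its type name.
    """
    heap = [None] * heap_capacity
    types = {}
    for address, value, type_name in heap_dump:
        heap[address] = value
        types[address] = type_name
    return heap, types


# Made-up dump: address 1 was null in the original heap, so it is absent.
demo_dump = [
    (0, "hello", "System.String"),
    (2, False, "System.Boolean"),
    (3, 0xFFFFFFFF, "System.UInt32"),
]
heap, types = build_heap(4, demo_dump)
assert heap[1] is None
```

Keeping the type names in a separate dict makes it possible to tell a genuinely null slot apart from a typed slot whose value happens to be null.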
Entry point table
This is effectively a function table.
A simplified class definition:
[Serializable]
public sealed class UdonSymbolTable : IUdonSymbolTable, ISerializable
{
    private readonly ImmutableArray<string> _exportedSymbols;
    private readonly ImmutableDictionary<string, IUdonSymbol> _nameToSymbol;

    void ISerializable.GetObjectData(
        SerializationInfo info, StreamingContext context
    )
    {
        info.AddValue(
            "Symbols",
            this._nameToSymbol.Values.ToList<IUdonSymbol>()
        );
        info.AddValue(
            "ExportedSymbols",
            this._exportedSymbols.ToList<string>()
        );
    }
}

[Serializable]
public sealed class UdonSymbol : IUdonSymbol, ISerializable
{
    public string Name { get; }
    public Type Type { get; }
    public uint Address { get; }

    void ISerializable.GetObjectData(
        SerializationInfo info, StreamingContext context
    )
    {
        info.AddValue("Name", this.Name);
        info.AddValue("Type", this.Type);
        info.AddValue("Address", this.Address);
    }
}
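For each UdonSymbol here:
- Name is the function name
- Address is the index in UdonProgram.ByteCode of the function's first instruction
- Type is meaningless
This makes our job much easier.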
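Symbol table
The class definition is the same as for the entry point table. For each UdonSymbol here:
- Name is the symbol name
- Address is the symbol's address in the heap
- Type is the symbol's type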
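Both tables can be inverted into address-to-name maps to make a disassembly listing readable. This is a sketch under assumptions: each symbol is represented as a (name, type_name, address) tuple, the names and addresses are made up, and a PUSH operand is treated as a heap address, which is how EXTERN, COPY, and JUMP_IF_FALSE consume stack values.

```python
def index_by_address(symbols):
    """Map each symbol's Address to its Name."""
    return {address: name for name, _type_name, address in symbols}


# Made-up tables in (Name, Type, Address) order.
entry_points = [("_start", None, 0), ("_interact", None, 40)]
symbol_table = [
    ("counter", "System.Int32", 0),
    ("message", "System.String", 1),
]

function_at = index_by_address(entry_points)  # bytecode offset -> function
variable_at = index_by_address(symbol_table)  # heap address -> symbol


def label_operand(opcode_name, operand):
    """Render an operand with a name when one is known, else as hex."""
    if opcode_name == "PUSH" and operand in variable_at:
        return variable_at[operand]
    if opcode_name in ("JUMP", "JUMP_IF_FALSE") and operand in function_at:
        return function_at[operand]
    return hex(operand)


print(label_operand("PUSH", 0))
print(label_operand("JUMP", 40))
```

The two maps must stay separate, because a bytecode offset and a heap address can share the same numeric value while meaning different things.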
Udon VM
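The Udon VM is a simple stack-based virtual machine.
Heap, stack, and registers
- Heap: an IStrongBox[]. An address is simply an array index, and the heap is initialized from the program's constant segment
- Stack: a stack of u32 values
- PC: measured in bytes
External functions
The Udon VM's external function delegate is UdonExternDelegate, defined as: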
delegate void UdonExternDelegate(IUdonHeap heap, Span<uint> parameterAddresses);
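In other words, it receives:
- the heap, used to read arguments and write results
- a series of parameter addresses (in the heap), used to locate the arguments
CachedUdonExternDelegate is a wrapper built on top of this, defined as: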
class CachedUdonExternDelegate
{
    public readonly string externSignature;
    public readonly UdonExternDelegate externDelegate;
    public readonly int parameterCount;
}
A CachedUdonExternDelegate can be obtained from nothing more than a string, namely the externSignature.
The externSignature is really just a plain function signature, for example:
ExternVRCEconomyIProduct.__Equals__VRCEconomyIProduct__SystemBoolean
ExternVRCEconomyIProduct.__get_Buyer__VRCSDKBaseVRCPlayerApi
ExternVRCEconomyIProduct.__get_Description__SystemString
ExternVRCEconomyIProduct.__get_ID__SystemString
ExternVRCEconomyIProduct.__get_Name__SystemString
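The signature consists of two parts, the ModuleName and the FuncSignature. A class (that is, a Module) implements IUdonWrapperModule to register its ModuleName, all of its FuncSignatures, and their parameter counts with UdonWrapper, which can then look them up by the full externSignature.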
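For a decompiler it helps to pull these signatures apart. The sketch below splits at the first "." to separate the two parts, and then splits the FuncSignature on double underscores. Both rules are guesses based on the examples listed here, not a documented grammar, and the function names are my own.

```python
def split_extern_signature(extern_signature):
    """Split 'Module.__func__Type' into (module_name, func_signature)."""
    module_name, _, func_signature = extern_signature.partition(".")
    return module_name, func_signature


def split_func_signature(func_signature):
    """Split a FuncSignature on '__' into (method_name, [type tokens]).

    Assumption: the first token is the method name and the rest are
    parameter and return types, as the examples suggest.
    """
    tokens = [token for token in func_signature.split("__") if token]
    return tokens[0], tokens[1:]


module, func = split_extern_signature(
    "ExternVRCEconomyIProduct.__Equals__VRCEconomyIProduct__SystemBoolean"
)
print(module, split_func_signature(func))
```

Single underscores, as in get_Buyer, survive because only the double underscore is treated as a separator.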
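Execution
Read the instruction at the current PC:
- NOP: advance PC by 4 bytes
- PUSH: push OPERAND onto the stack as an immediate, then advance PC by 8 bytes
- POP: pop the stack and discard the top value, then advance PC by 4 bytes
- JUMP_IF_FALSE: the top of the stack is a heap address. Pop it and read the value of the heap element (a bool) at that address
  - if true, advance PC by 8 bytes
  - if false, set PC to OPERAND
- JUMP: set PC to OPERAND
- EXTERN: call an external function. Try to read the object that OPERAND, taken as a heap address, points to
  - if it is a string, obtain the CachedUdonExternDelegate for that string through UdonWrapper
  - if it is a CachedUdonExternDelegate, we already have the CachedUdonExternDelegate

  Then pop CachedUdonExternDelegate.parameterCount parameter addresses from the stack, assemble them in the reverse of the pop order (so the original top of the stack becomes the last address) into Span<uint> parameterAddresses, and call the UdonExternDelegate. Advance PC by 8 bytes
- ANNOTATION: advance PC by 8 bytes
- JUMP_INDIRECT: set PC to the u32 value that OPERAND, taken as a heap address, points to
- COPY: pop two addresses from the stack, TARGET first and then SOURCE, then overwrite the heap value at TARGET with the heap value at SOURCE. Advance PC by 4 bytes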
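The steps above can be sketched as a toy interpreter. This is not the real Udon VM: the heap is a plain Python list, the `externs` dict is a hypothetical stand-in for the UdonWrapper lookup, and since the steps do not say how execution ends, the sketch simply stops once PC leaves the bytecode.

```python
import struct

NOP, PUSH, POP, JUMP_IF_FALSE, JUMP = 0, 1, 2, 4, 5
EXTERN, ANNOTATION, JUMP_INDIRECT, COPY = 6, 7, 8, 9


def run(bytecode, heap, externs, pc=0, max_steps=1000):
    """Interpret bytecode against a list-based heap.

    externs maps a signature string to (parameter_count, function), where
    function(heap, parameter_addresses) mimics UdonExternDelegate.
    """
    stack = []
    for _ in range(max_steps):
        if pc >= len(bytecode):
            return  # assumption: running off the end halts the program
        (opcode,) = struct.unpack_from(">I", bytecode, pc)
        operand = None
        if opcode not in (NOP, POP, COPY):
            (operand,) = struct.unpack_from(">I", bytecode, pc + 4)

        if opcode == NOP:
            pc += 4
        elif opcode == PUSH:
            stack.append(operand)
            pc += 8
        elif opcode == POP:
            stack.pop()
            pc += 4
        elif opcode == JUMP_IF_FALSE:
            condition = heap[stack.pop()]
            pc = pc + 8 if condition else operand
        elif opcode == JUMP:
            pc = operand
        elif opcode == EXTERN:
            entry = heap[operand]
            if isinstance(entry, str):  # signature string: look it up
                entry = externs[entry]
            parameter_count, function = entry
            # Popping yields the last parameter first, so reverse afterwards.
            addresses = [stack.pop() for _ in range(parameter_count)][::-1]
            function(heap, addresses)
            pc += 8
        elif opcode == ANNOTATION:
            pc += 8
        elif opcode == JUMP_INDIRECT:
            pc = heap[operand]
        elif opcode == COPY:
            target = stack.pop()
            source = stack.pop()
            heap[target] = heap[source]
            pc += 4
        else:
            raise ValueError(f"unknown opcode {opcode} at offset {pc}")
    raise RuntimeError("step limit reached")


def demo_add(heap, addresses):
    """Made-up extern: heap[a2] = heap[a0] + heap[a1]."""
    heap[addresses[2]] = heap[addresses[0]] + heap[addresses[1]]


# PUSH 0; PUSH 1; PUSH 2; EXTERN 3; PUSH 2; PUSH 4; COPY
program = struct.pack(">" + "I" * 13, 1, 0, 1, 1, 1, 2, 6, 3, 1, 2, 1, 4, 9)
demo_heap = [2, 3, None, "Demo.__add", None]
run(program, demo_heap, {"Demo.__add": (3, demo_add)})
assert demo_heap[4] == 5
```

The demo pushes three heap addresses, calls the made-up extern to add the first two values into the third, and then copies the result to address 4.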
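Decompilation
I wrote a decompiler with Claude. It works somewhat better than earlier disassemblers and decompilers, but there is still a lot left to do.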
¹ The class definitions shown in this section list only the parts that end up in the serialized Udon Program binary.