---
description: Binary data conversion rules when reading C++ binary files in C#
---
# Binary Data Conversion Rules (C++ to C#)
When using `FileStream` in C# to read binary data written by C++ code, follow these conversion patterns, chosen by the type of each variable in the original C++ struct.
## Important Type Mappings
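- `unsigned long` in C++ → `uint` in C# (4 bytes under Windows/MSVC; it is 8 bytes on 64-bit Linux and macOS, so this mapping assumes the file was written by a Windows build)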
- `bool` in C++ → 1 byte (read as `byte` in C#, then convert with `> 0`)
- `wchar_t` on Windows → 2 bytes
- `task_char` → `ushort`
- Pointer field in a C++ struct → 4 bytes when the file was written by a 32-bit build (the assumption used throughout these rules)
**Critical**: When reading binary data, match the original C++ memory layout exactly, accounting for struct packing and alignment.
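As a rough illustration of these mappings, the sketch below mirrors a small C++ struct on the C# side. `SampleHeader`, its fields, and the `#pragma pack(1)` setting are hypothetical and chosen only for this example; they are not taken from the real code base.
```csharp
using System.Runtime.InteropServices;

// Hypothetical C++ source, assumed to be compiled with #pragma pack(1) by a 32-bit MSVC build:
//   struct SampleHeader {
//       unsigned long m_ID;        // 4 bytes
//       bool          m_bHasSign;  // 1 byte
//       task_char     m_cFirst;    // 2 bytes
//       void*         m_pData;     // 4 bytes
//   };
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleHeader
{
    public uint m_ID;        // unsigned long -> uint
    public byte m_bHasSign;  // bool -> byte, convert with > 0 after reading
    public ushort m_cFirst;  // task_char -> ushort
    public uint m_pData;     // 32-bit pointer slot; the stored address is meaningless on disk
}

public static class SampleHeaderLayout
{
    // Marshal.SizeOf reports the unmanaged size, which reflects the Pack setting.
    // With Pack = 1 this is 4 + 1 + 2 + 4 = 11 bytes.
    public static int Size => Marshal.SizeOf<SampleHeader>();
}
```
With default packing instead of `Pack = 1`, one padding byte would follow `m_bHasSign` and the same struct would occupy 12 bytes, which is why the packing of the original C++ struct has to be known before reading.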
## Case 1: Normal Value Type Variables (Primitives)
**Pattern**: Directly stored scalar values (no arrays, no pointers, no user-defined structs)
**C++ Examples**:
```cpp
unsigned long m_ID;
bool m_bHasSign;
int m_lAvailFrequency;
float m_fLibraryTasksProbability;
```
**C# Conversion**:
```csharp
// Read value and assign to the field
fixedData.m_ID = AAssit.ReadFromBinaryOf<uint>(fp, ref readBytes);
// Special case for bool: read as byte and compare with > 0
fixedData.m_bHasSign = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes) > 0;
fixedData.m_bChooseOne = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes) > 0;
// Other primitive types
fixedData.m_lAvailFrequency = AAssit.ReadFromBinaryOf<int>(fp, ref readBytes);
fixedData.m_fLibraryTasksProbability = AAssit.ReadFromBinaryOf<float>(fp, ref readBytes);
```
**Rule**: Use `AAssit.ReadFromBinaryOf<T>(fp, ref readBytes)` where T is the C# equivalent type.
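**Important for bool**: C++ `bool` is stored as 1 byte in binary files, whereas C# `bool` marshals as 4 bytes by default, so a size-based generic read may consume the wrong number of bytes. Always read it as `byte` and convert to a C# `bool` using `> 0`: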
```csharp
// Correct pattern for bool
fixedData.m_boolField = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes) > 0;
```
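The size mismatch behind this rule can be checked directly. The following is a minimal sketch, not the `AAssit` implementation: `BoolReadSketch` and `ReadCppBool` are hypothetical names, and whether `AAssit` sizes its reads with `Marshal.SizeOf` is an assumption.
```csharp
using System.IO;
using System.Runtime.InteropServices;

public static class BoolReadSketch
{
    // Reads one byte and maps any non-zero value to true, matching the "> 0" rule.
    public static bool ReadCppBool(Stream fp, ref long readBytes)
    {
        int b = fp.ReadByte();  // returns -1 at end of stream
        if (b < 0) throw new EndOfStreamException("Expected 1 byte for a C++ bool.");
        readBytes += 1;
        return b > 0;
    }

    // A managed bool occupies 1 byte...
    public static int ManagedBoolSize => sizeof(bool);
    // ...but its default marshaled size is 4 bytes (Win32 BOOL).
    public static int MarshaledBoolSize => Marshal.SizeOf<bool>();
}
```
If a generic reader sizes `T` through `Marshal.SizeOf`, reading `bool` would advance the stream by 4 bytes and shift every following field, which is consistent with the rule to read a `byte` instead.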
---
## Case 2: Fixed-Size Arrays (Inline Memory)
**Pattern**: Arrays with a known constant size defined by a macro or constant
**C++ Examples**:
```cpp
task_char m_szName[MAX_TASK_NAME_LEN]; // MAX_TASK_NAME_LEN = 30
char m_tmType[MAX_TIMETABLE_SIZE]; // MAX_TIMETABLE_SIZE = 32
task_tm m_tmStart[MAX_TIMETABLE_SIZE];
task_tm m_tmEnd[MAX_TIMETABLE_SIZE];
```
**C# Conversion**:
```csharp
// Read array from binary and assign to the field
fixedData.m_szName = AAssit.ReadArrayFromBinary<ushort>(fp, TaskTemplConstants.MAX_TASK_NAME_LEN, ref readBytes);
fixedData.m_tmType = AAssit.ReadArrayFromBinary<byte>(fp, TaskTemplConstants.MAX_TIMETABLE_SIZE, ref readBytes);
fixedData.m_tmStart = AAssit.ReadArrayFromBinary<task_tm>(fp, TaskTemplConstants.MAX_TIMETABLE_SIZE, ref readBytes);
fixedData.m_tmEnd = AAssit.ReadArrayFromBinary<task_tm>(fp, TaskTemplConstants.MAX_TIMETABLE_SIZE, ref readBytes);
```
**Rule**: Use `AAssit.ReadArrayFromBinary<T>(fp, arraySize, ref readBytes)` where:
- `T` is the C# equivalent element type
- `arraySize` is the predefined constant size
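To sanity-check how far the stream should advance after an inline array, the expected byte count is simply the element count times the element size. The helper below is a hypothetical sketch; the two constants repeat the values given in the C++ comments above.
```csharp
public static class ArraySizeSketch
{
    public const int MAX_TASK_NAME_LEN = 30;   // value from the C++ comment above
    public const int MAX_TIMETABLE_SIZE = 32;  // value from the C++ comment above

    // Bytes an inline array occupies in the file: element count times element size.
    public static long InlineArrayBytes(int count, int elementSize)
    {
        return (long)count * elementSize;
    }
}
```
For example, `InlineArrayBytes(ArraySizeSketch.MAX_TASK_NAME_LEN, sizeof(ushort))` gives 60 bytes for `m_szName`, and `InlineArrayBytes(ArraySizeSketch.MAX_TIMETABLE_SIZE, sizeof(byte))` gives 32 bytes for `m_tmType`.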
---
## Case 3: Pointer to User-Defined Type or String
**Pattern**: Fields that store only the pointer address, not the actual object/data
**C++ Examples**:
```cpp
task_char* m_pszSignature;
AWARD_DATA* m_Award_S;
Task_Region* m_pDelvRegion;
ITEM_WANTED* m_PremItems;
```
**C# Conversion**:
```csharp
// Skip pointer size (4 bytes for 32-bit pointers)
// The content is not inlined in the struct
fp.Seek(4, SeekOrigin.Current);
```
**Rule**: Skip 4 bytes using `fp.Seek(4, SeekOrigin.Current)` because only the pointer address is stored in the binary file, not the actual data.
**Note**: If the pointed-to data exists separately in the file, it must be read later based on additional logic (e.g., checking a count variable or flag).
---
## Case 4: User-Defined Struct (Inline, Not Pointer)
**Pattern**: A complete user-defined struct embedded directly in the parent struct
**C++ Examples**:
```cpp
task_tm m_tmAbsFailTime; // Inline struct
AWARD_DATA m_Award_S; // Inline struct (if not a pointer)
```
**C# Conversion**:
```csharp
// Skip the full size of the struct in bytes
// Check the struct's internal members and padding to calculate correct size
fp.Seek(24, SeekOrigin.Current); // Example: sizeof(task_tm) = 24 bytes on Windows/MSVC
```
**Rule**:
1. Calculate the exact size of the struct including padding
2. Skip that many bytes using `fp.Seek(structSize, SeekOrigin.Current)`
**Important**: Treat a literal size such as `24` as an example only, not a value to copy. Derive the actual struct size from:
- Size of each member
- Struct packing (`Pack = 1`, `Pack = 4`, etc.)
- Platform-specific alignment rules
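One way to avoid a hard-coded size is to let the runtime compute it from a C# mirror of the struct. This is a sketch under stated assumptions: `SampleTime` is a hypothetical stand-in (the real `task_tm` definition is not shown here), with six `int` fields chosen only so that it matches the 24-byte example, and `SkipStruct` is not part of `AAssit`.
```csharp
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical stand-in for an inline struct; field names and count are assumptions.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SampleTime
{
    public int year, month, day, hour, min, wday;
}

public static class StructSkipSketch
{
    // Skips one inline struct of type T, using its unmanaged size (packing included),
    // and keeps the running byte counter in sync with the stream position.
    public static void SkipStruct<T>(Stream fp, ref long readBytes) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        fp.Seek(size, SeekOrigin.Current);
        readBytes += size;
    }
}
```
The computed size is only as accurate as the mirror struct: its field types and `Pack` value still have to match the original C++ declaration.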
**Alternative**: If you need the data, read the struct directly:
```csharp
fixedData.m_tmAbsFailTime = AAssit.ReadFromBinaryOf<task_tm>(fp, ref readBytes);
```
---
## Case 5: Pointer to Basic Type (Not Inline)
**Pattern**: Pointer to a basic/primitive type where only the pointer address is stored in the struct. The pointed-to content (if any) is stored elsewhere in the file and must be read later based on counts/flags.
**C++ Examples**:
```cpp
unsigned short* m_pszSignature; // Pointer to unsigned short (ushort in C#)
int* m_plChangeKey; // Pointer to int
bool* m_pbChangeType; // Pointer to bool
float* m_pFloatArray; // Pointer to float
```
**C# Conversion**:
```csharp
// Always skip the pointer address (4 bytes in the binary layout)
fp.Seek(4, SeekOrigin.Current);
```
**Rule**: For any pointer (to basic or user-defined types), always skip 4 bytes to account for the stored pointer address in the C++ binary. The actual data, if present, should be read later using accompanying count/flag fields.
**Note**: This assumes the source binary was written with 32-bit pointer sizes. If you ever process binaries with 64-bit pointer sizes, adjust accordingly.
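A small helper can keep the pointer size in one place so that a 64-bit variant only needs a different argument. This is a hypothetical sketch: `PointerSkipSketch` and `SkipPointer` are illustrative names, and updating `readBytes` on a skip is an assumption about how the counter should behave.
```csharp
using System.IO;

public static class PointerSkipSketch
{
    // Assumption: the file was written by a 32-bit build, so each pointer slot is 4 bytes.
    public const int PointerSize32 = 4;

    // Skips one stored pointer slot and adds the skipped bytes to the running counter.
    public static void SkipPointer(Stream fp, ref long readBytes, int pointerSize = PointerSize32)
    {
        fp.Seek(pointerSize, SeekOrigin.Current);
        readBytes += pointerSize;
    }
}
```
For a file written by a 64-bit build, the same call with `pointerSize: 8` would skip an 8-byte slot instead.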
---
## Decision Tree
Use this flowchart to determine which case applies:
```
Is it a pointer (has * in C++)?
├── YES
│   └── Is it a user-defined type (struct/class)?
│       ├── YES → Case 3: Skip 4 bytes (pointer address)
│       └── NO (basic type) → Case 5: Skip 4 bytes (pointer address)
└── NO
    └── Is it an array with predefined size?
        ├── YES → Case 2: Read array using ReadArrayFromBinary
        └── NO
            └── Is it a user-defined struct (inline)?
                ├── YES → Case 4: Skip sizeof(struct) or read struct
                └── NO → Case 1: Read value using ReadFromBinaryOf
```
---
## Common Patterns and Examples
### Pattern: Conditional Data Reading
When a field has associated data only if a flag/count is set:
**C++ Example**:
```cpp
bool m_bHasSign;
task_char* m_pszSignature; // Only has data if m_bHasSign is true
unsigned long m_ulTimetable;
task_tm* m_tmStart; // Only has data if m_ulTimetable > 0
```
**C# Conversion**:
```csharp
// First read the flag/count (read bool as byte and convert with > 0)
fixedData.m_bHasSign = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes) > 0;
// Skip the pointer (Case 3)
fp.Seek(4, SeekOrigin.Current);
// Later, conditionally read the actual data
if (fixedData.m_bHasSign)
{
    fixedData.m_pszSignature = AAssit.ReadArrayFromBinary<ushort>(fp, MAX_TASK_NAME_LEN, ref readBytes);
}
```
```
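The count-driven variant of this pattern can be sketched in a self-contained form. Everything here is illustrative: `ConditionalReadSketch`, the `maxCount` guard, the use of `int` as the element type, and the assumption that the out-of-line data immediately follows the pointer slot are not taken from the real file format.
```csharp
using System.IO;

public static class ConditionalReadSketch
{
    // Assumed layout: a uint count, a 4-byte pointer slot, then `count` int values
    // (int stands in for whatever element type the pointer refers to).
    public static int[] ReadCountedInts(Stream fp, uint maxCount = 1024)
    {
        var reader = new BinaryReader(fp);  // reads little-endian values; left undisposed so fp stays open
        uint count = reader.ReadUInt32();   // the count field, e.g. m_ulTimetable
        fp.Seek(4, SeekOrigin.Current);     // skip the stored pointer address

        // Guard against a corrupt or misaligned count before allocating.
        if (count > maxCount)
            throw new InvalidDataException($"Unreasonable element count: {count}");

        var values = new int[count];
        for (int i = 0; i < values.Length; i++)
            values[i] = reader.ReadInt32(); // the out-of-line data
        return values;
    }
}
```
A count of zero yields an empty array and reads nothing further, which corresponds to the "only has data if the count is greater than 0" rule.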
### Pattern: String Conversion
For `task_char` arrays (which are `ushort` arrays in C#):
```csharp
// Read the array
fixedData.m_szName = AAssit.ReadArrayFromBinary<ushort>(fp, MAX_TASK_NAME_LEN, ref readBytes);
// Convert to readable string
TaskTemplUtils.convert_txt(ref fixedData.m_szName, MAX_TASK_NAME_LEN, TaskTemplUtils.uint_to_ushort(fixedData.m_ID));
string taskName = ByteToStringUtils.UshortArrayToCP936String(fixedData.m_szName);
```
---
## Reference: AAssit Helper Methods
From [AAssit.cs](mdc:Assets/PerfectWorld/Scripts/Common/DataProcess/AAssit.cs):
```csharp
// Read a single value
public static T ReadFromBinaryOf<T>(FileStream fp, ref long readBytes) where T : struct
// Read an array of values
public static T[] ReadArrayFromBinary<T>(FileStream fp, int count, ref long readBytes) where T : struct
```
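The bodies of these helpers are not shown here. Purely as an illustration of what helpers with this shape could do, the sketch below reads raw bytes and reinterprets them with `MemoryMarshal`. `BinaryReadSketch` and its methods are hypothetical, take a `Stream` rather than a `FileStream`, and are not the actual `AAssit` code.
```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class BinaryReadSketch
{
    // Reads one value of T by copying its raw bytes (host byte order, no conversion).
    public static T ReadValue<T>(Stream fp, ref long readBytes) where T : struct
    {
        byte[] buffer = new byte[Unsafe.SizeOf<T>()];
        FillBuffer(fp, buffer);
        readBytes += buffer.Length;
        return MemoryMarshal.Read<T>(buffer);
    }

    // Reads `count` consecutive values of T into a new array.
    public static T[] ReadArray<T>(Stream fp, int count, ref long readBytes) where T : struct
    {
        byte[] buffer = new byte[checked(count * Unsafe.SizeOf<T>())];
        FillBuffer(fp, buffer);
        readBytes += buffer.Length;
        return MemoryMarshal.Cast<byte, T>(buffer.AsSpan()).ToArray();
    }

    // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
    private static void FillBuffer(Stream fp, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int n = fp.Read(buffer, offset, buffer.Length - offset);
            if (n == 0) throw new EndOfStreamException();
            offset += n;
        }
    }
}
```
This sketch sizes `T` with `Unsafe.SizeOf<T>()`, the managed size, and assumes a little-endian host reading a little-endian file; a real implementation may differ on both points.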
---
## Best Practices
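1. **Always track readBytes**: Pass `ref readBytes` to every read so the running byte count is available for debugging. `fp.Seek` does not update it, so account for skipped bytes separately if the count must match `fp.Position`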
2. **Match C++ layout exactly**: Ensure struct packing in C# matches C++ (`Pack = 1` is common)
3. **Document your assumptions**: Add comments explaining the byte sizes you're skipping
4. **Verify with logs**: Log read values to verify correctness
5. **Check file position**: Use `fp.Position` to verify you're reading from the expected location
6. **Handle platform differences**: Pointer and `unsigned long` sizes are fixed by the build that wrote the file (32-bit here), not by the platform running the C# reader
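The position check in item 5 can be turned into a small guard that fails fast when a block consumed an unexpected number of bytes. `PositionCheckSketch` and `EnsureConsumed` are hypothetical names used only for this sketch.
```csharp
using System.IO;

public static class PositionCheckSketch
{
    // Throws if the stream has not advanced by exactly expectedBytes since startPosition.
    public static void EnsureConsumed(Stream fp, long startPosition, long expectedBytes, string label)
    {
        long actual = fp.Position - startPosition;
        if (actual != expectedBytes)
            throw new InvalidDataException(
                $"{label}: expected to consume {expectedBytes} bytes but consumed {actual}.");
    }
}
```
A typical use would be to record `long start = fp.Position;` before reading a struct's fixed part and then call `EnsureConsumed(fp, start, expectedSize, "fixed data")` afterwards, where `expectedSize` is the size calculated from the C++ declaration.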
---
## Common Mistakes to Avoid
❌ **Wrong**: Reading bool directly as bool type
```csharp
// C++: bool m_bHasSign
fixedData.m_bHasSign = AAssit.ReadFromBinaryOf<bool>(fp, ref readBytes); // Wrong!
```
✅ **Correct**:
```csharp
// Read as byte and compare with > 0
fixedData.m_bHasSign = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes) > 0; // Correct!
```
---
❌ **Wrong**: Skipping only `sizeof(basic type)` for a pointer in Case 5
```csharp
// C++: ushort* m_pszSignature
fp.Seek(2, SeekOrigin.Current); // Wrong! Do not skip sizeof(ushort) for pointers
```
✅ **Correct**:
```csharp
// Always skip the pointer address (4 bytes)
fp.Seek(4, SeekOrigin.Current);
```
---
❌ **Wrong**: Reading inline array as a single value
```csharp
// C++: char m_tmType[MAX_TIMETABLE_SIZE]
byte value = AAssit.ReadFromBinaryOf<byte>(fp, ref readBytes); // Wrong!
```
✅ **Correct**:
```csharp
byte[] values = AAssit.ReadArrayFromBinary<byte>(fp, MAX_TIMETABLE_SIZE, ref readBytes);
```
---
❌ **Wrong**: Not accounting for struct padding
```csharp
// C++: struct with padding
fp.Seek(20, SeekOrigin.Current); // Might be wrong due to padding!
```
✅ **Correct**:
```csharp
// Calculate actual size with padding
// Or read the struct directly if needed
fixedData.m_structField = AAssit.ReadFromBinaryOf<MyStruct>(fp, ref readBytes);
```