元数据映射说明
描述
CloudCanal 任务配置中,比较难理解的包含两部分内容,一是描述数据源库表列等元数据的结构,二是源和目标元数据映射关系。
本文档主要介绍如何理解源和目标元数据映射关系。
举个例子
如下结构一般出现在 DataJob 配置 mappingDef 中,表示源和目标元数据多层映射。一般和 srcSchema 配合计算出对端相应元数据。这个结构有几个特点
- 以json数组表示多个层次的映射
- 当前层次的映射元素从低向上呈现
- 一个层次的映射有优先级,serializeMapping 最优先,serializeAutoGenRules 次之(当前未使用),commonGenRule 打底
MySQL->PostgreSQL 库表列映射示例
[
{
"method": "DB_SCHEMA",
"serializeMapping": {
"{\"value\":\"dingtax\"}": "{\"parent\":{\"value\":\"dingtax_target\"},\"value\":\"public\"}"
},
"serializeAutoGenRules": {},
"commonGenRule": "MIRROR"
},
{
"serializeMapping": {
"{\"parent\":{\"value\":\"dingtax\"},\"value\":\"worker_stats\"}": "{\"value\":\"re_worker_stats\",\"parent\":{\"value\":\"public\",\"parent\":{\"value\":\"dingtax_target\"}}}"
},
"method": "TABLE_TABLE",
"serializeAutoGenRules": {},
"commonGenRule": "MIRROR"
},
{
"method": "COLUMN_COLUMN",
"serializeMapping": {
"{\"parent\":{\"value\":\"worker_stats\",\"parent\":{\"value\":\"dingtax\"}},\"value\":\"worker_id\"}": "{\"value\":\"re_worker_id\",\"parent\":{\"value\":\"re_worker_stats\",\"parent\":{\"value\":\"public\",\"parent\":{\"value\":\"dingtax_target\"}}}}",
"{\"parent\":{\"value\":\"worker_stats\",\"parent\":{\"value\":\"dingtax\"}},\"value\":\"cpu_stat\"}": "{\"value\":\"re_cpu_stat\",\"parent\":{\"value\":\"re_worker_stats\",\"parent\":{\"value\":\"public\",\"parent\":{\"value\":\"dingtax_target\"}}}}"
},
"serializeAutoGenRules": {},
"commonGenRule": "MIRROR"
}
]
MySQL->Kafka 表topic映射示例
[
{
"serializeMapping": {
"{\"parent\":{\"value\":\"dingtax\"},\"value\":\"unsigned_table\"}": "{\"value\":\"my-6716s8rryux1366.dingtax.unsigned_table\"}",
"{\"parent\":{\"value\":\"dingtax\"},\"value\":\"worker_stats\"}": "{\"value\":\"my-6716s8rryux1366.dingtax.worker_stats\"}"
},
"method": "TABLE_TOPIC",
"serializeAutoGenRules": {},
"commonGenRule": "MIRROR"
},
{
"method": "COLUMN_COLUMN",
"serializeMapping": {},
"serializeAutoGenRules": {},
"commonGenRule": "MIRROR"
}
]
method 可选择的值
method 参数表明了当前层次映射所计算出的结果值是什么。可选的参数包括
DB_DB,
SCHEMA_SCHEMA,
TABLE_TABLE,
COLUMN_COLUMN,
// jump level
DB_SCHEMA,
SCHEMA_DB,
// mq
DB_TOPIC,
TABLE_TOPIC,
TOPIC_TABLE,
TOPIC_INDEX,
ANY_DB,
// es
TABLE_INDEX,
// cache key prefix
TABLE_KEYPREFIX
serializeMapping KV格式
serializeMapping 信息表示当前层次映射详情,其中 key 或者 value 有哪些信息,可以通过 数据源元数据映射信息 接口获取,所获取到的 parents 按顺序构建
commonGenRule 可选的值
commonGenRule 为兜底的映射规则,如果 serializeMapping 未指定,serializeAutoGenRules 未设置特殊规则,则使用 commonGenRule 计算。
无特殊情况,一般设置 MIRROR 规则即可,即和源端保持一致
MIRROR
TO_LOWER_CASE
TO_UPPER_CASE
/**
* instance_id,db_name -> instance_id.db_name
*/
SOURCE_INS_DB_BY_DOT
/**
* instance_id,db_name,table -> instance_id.db_name.table
*/
SOURCE_INS_DB_TABLE_BY_DOT
/**
* instance_id,db_name,schema_name,table -> instance_id.db_name.schema_name.table
*/
SOURCE_INS_DB_SCHEMA_TABLE_BY_DOT
/**
* instance_id,db_name -> instance_id%db_name
*/
SOURCE_INS_DB_BY_PERCENT
/**
* instance_id,db_name,table -> instance_id%db_name%table
*/
SOURCE_INS_DB_TABLE_BY_PERCENT
/**
* instance_id,db_name,schema_name,table -> instance_id%db_name%schema_name%table
*/
SOURCE_INS_DB_SCHEMA_TABLE_BY_PERCENT
/**
* instance_id.db_name.schema_name.table -> table
*/
SOURCE_LAST_ITEM_BY_DOT
/**
* instance_id%db_name%schema_name%table -> table
*/
SOURCE_LAST_ITEM_BY_PERCENT
NUMBER_SUFFIX_REG_EXPRESSION
DEFAULT_TOPIC
/**
* db_name,table -> db_name:table
*/
SOURCE_DB_TABLE_BY_COLON
/**
* db_name,schema_name,table -> db_name:schema_name:table
*/
SOURCE_DB_SCHEMA_TABLE_BY_COLON
/**
* aa_bb_cc -> aaBbCc
*/
TO_CAMEL_FORMAT