Learning OceanBase, Part 1: Migrating MySQL Data to OceanBase


I. Database environment

# MySQL environment

root@192.168.150.162 20:28:  [(none)]> select version();
+-----------+
| version() |
+-----------+
| 8.0.26    |
+-----------+
1 row in set (0.00 sec)

root@192.168.150.162 20:28:  [(none)]> show variables like '%char%';
+--------------------------+-----------------------------------+
| Variable_name            | Value                             |
+--------------------------+-----------------------------------+
| character_set_client     | utf8mb4                           |
| character_set_connection | utf8mb4                           |
| character_set_database   | utf8mb4                           |
| character_set_filesystem | binary                            |
| character_set_results    | utf8mb4                           |
| character_set_server     | utf8mb4                           |
| character_set_system     | utf8mb3                           |
| character_sets_dir       | /usr/local/mysql8/share/charsets/ |
+--------------------------+-----------------------------------+
8 rows in set (0.00 sec)


# OceanBase environment

obclient [test]> select version();
+------------------------------+
| version()                    |
+------------------------------+
| 5.7.25-OceanBase_CE-v4.2.1.2 |
+------------------------------+
1 row in set (0.002 sec)

obclient [test]> show variables like '%chara%';
+--------------------------+---------+
| Variable_name            | Value   |
+--------------------------+---------+
| character_set_client     | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database   | utf8mb4 |
| character_set_filesystem | binary  |
| character_set_results    | utf8mb4 |
| character_set_server     | utf8mb4 |
| character_set_system     | utf8mb4 |
+--------------------------+---------+
7 rows in set (0.005 sec)

This confirms the character sets on the MySQL and OceanBase sides match. The only difference is character_set_system (utf8mb3 vs. utf8mb4), a server-internal setting that does not affect table data.
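The character-set comparison above can be automated rather than done by eye. A minimal sketch (the variable values are copied from the two SHOW VARIABLES outputs above; install-specific paths such as character_sets_dir are omitted):

```python
# Compare character-set variables captured from the source (MySQL)
# and target (OceanBase) servers.
mysql_vars = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
    "character_set_system": "utf8mb3",
}
oceanbase_vars = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "utf8mb4",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
    "character_set_system": "utf8mb4",
}

def charset_diff(src, dst):
    """Return the variables whose values differ between the two servers."""
    return {k: (src[k], dst.get(k)) for k in src if src[k] != dst.get(k)}

diff = charset_diff(mysql_vars, oceanbase_vars)
print(diff)  # only character_set_system differs: ('utf8mb3', 'utf8mb4')
```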

II. Migrating data to OceanBase with mysqldump

Export the data as SQL text with mysqldump on the MySQL side, transfer the dump file to the OceanBase host, and then import it into OceanBase with the source command.

# Tables in the current MySQL instance

MySQL [(none)]> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MySQL [test]> show tables;
+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+
11 rows in set (0.00 sec)

# Export the data with mysqldump

mysqldump -h 192.168.150.162 -uroot -P4000 -p --databases test > test_oceanbase.sql
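One caveat before importing: a dump taken from MySQL 8.0 can contain clauses the target may reject. As an assumption to verify against your OceanBase version, the sketch below rewrites MySQL 8.0's default collation utf8mb4_0900_ai_ci to utf8mb4_general_ci in the dump text; the function name is hypothetical and the rewrite is only needed if the import actually errors on that collation.

```python
import re

# Hypothetical post-processing of a MySQL 8.0 dump before importing it
# into OceanBase. Assumption: the target does not accept the 8.0 default
# collation utf8mb4_0900_ai_ci; map it to utf8mb4_general_ci instead.
def rewrite_dump(sql_text: str) -> str:
    return re.sub(r"utf8mb4_0900_ai_ci", "utf8mb4_general_ci", sql_text)

line = "CREATE TABLE t1 (id INT) DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;"
print(rewrite_dump(line))
```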

# Transfer the dump file to the OceanBase server

scp test_oceanbase.sql 192.168.150.116:/home/admin

# Import into OceanBase

obclient [test]> source test_oceanbase.sql

obclient [test]> show tables;
+-------------------+
| Tables_in_test    |
+-------------------+
| cluster_test      |
| cluster_test1     |
| cluster_test2     |
| cluster_test3     |
| cluster_test4     |
| t1                |
| t2                |
| t8                |
| t_smallint        |
| test_clustered    |
| test_nonclustered |
+-------------------+
11 rows in set (0.004 sec)

# Spot-check that tables and data were imported

obclient [test]> select * from big;
+----------------------+---------------------+
| id                   | id1                 |
+----------------------+---------------------+
| 18446744073709551615 | 9223372036854775807 |
+----------------------+---------------------+
1 row in set (0.003 sec)
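The spot-checked row is a useful one: its two values are exactly the BIGINT UNSIGNED and signed BIGINT upper bounds, so it exercises the extremes of the integer range. A quick sanity check of those constants:

```python
# The two values in table `big` are the integer-type extremes.
id_value = 18446744073709551615   # BIGINT UNSIGNED column
id1_value = 9223372036854775807   # signed BIGINT column

assert id_value == 2**64 - 1   # BIGINT UNSIGNED upper bound
assert id1_value == 2**63 - 1  # signed BIGINT upper bound
print("boundary values intact")
```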

III. Offline data import from MySQL to OceanBase with DataX

# DataX deployment

DataX download: https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

1. Download DataX directly on the server

wget https://datax-opensource.oss-cn-hangzhou.aliyuncs.com/202210/datax.tar.gz

2. Extract DataX

[admin@localhost ~]$ tar zxvf datax.tar.gz

3. Install Java

yum install java

4. Verify the DataX installation with the bundled self-test job

[admin@localhost bin]$ python datax.py ../job/job.json

DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright © 2010-2017, Alibaba Group. All Rights Reserved.

2023-12-21 23:22:50.245 [main] INFO  MessageSource - JVM TimeZone: GMT+08:00, Locale: zh_CN
2023-12-21 23:22:50.248 [main] INFO  MessageSource - use Locale: zh_CN timeZone: sun.util.calendar.ZoneInfo[id="GMT+08:00",offset=28800000,dstSavings=0,useDaylight=false,transitions=0,lastRule=null]
2023-12-21 23:22:50.307 [main] INFO  VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-12-21 23:22:50.312 [main] INFO  Engine - the machine info  =>

	osInfo:	Red Hat, Inc. 1.8 25.392-b08
	jvmInfo:	Linux amd64 3.10.0-1160.el7.x86_64
	cpu num:	8

	totalPhysicalMemory:	-0.00G
	freePhysicalMemory:	-0.00G
	maxFileDescriptorCount:	-1
	currentOpenFileDescriptorCount:	-1

	GC Names	[PS MarkSweep, PS Scavenge]

	MEMORY_NAME                    | allocation_size                | init_size
	PS Eden Space                  | 256.00MB                       | 256.00MB
	Code Cache                     | 240.00MB                       | 2.44MB
	Compressed Class Space         | 1,024.00MB                     | 0.00MB
	PS Survivor Space              | 42.50MB                        | 42.50MB
	PS Old Gen                     | 683.00MB                       | 683.00MB
	Metaspace                      | -0.00MB                        | 0.00MB

2023-12-21 23:22:50.329 [main] INFO  Engine -
{
	"content":[
		{
			"reader":{
				"name":"streamreader",
				"parameter":{
					"column":[
						{
							"type":"string",
							"value":"DataX"
						},
						{
							"type":"long",
							"value":19890604
						},
						{
							"type":"date",
							"value":"1989-06-04 00:00:00"
						},
						{
							"type":"bool",
							"value":true
						},
						{
							"type":"bytes",
							"value":"test"
						}
					],
					"sliceRecordCount":100000
				}
			},
			"writer":{
				"name":"streamwriter",
				"parameter":{
					"encoding":"UTF-8",
					"print":false
				}
			}
		}
	],
	"setting":{
		"errorLimit":{
			"percentage":0.02,
			"record":0
		},
		"speed":{
			"channel":1
		}
	}
}

2023-12-21 23:22:50.348 [main] WARN  Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-12-21 23:22:50.361 [main] INFO  PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-12-21 23:22:50.361 [main] INFO  JobContainer - DataX jobContainer starts job.
2023-12-21 23:22:50.365 [main] INFO  JobContainer - Set jobId = 0
2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - jobContainer starts to do prepare ...
2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do prepare work .
2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2023-12-21 23:22:50.402 [job-0] INFO  JobContainer - jobContainer starts to do split ...
2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - Job set Channel-Number to 1 channels.
2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2023-12-21 23:22:50.403 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2023-12-21 23:22:50.421 [job-0] INFO  JobContainer - jobContainer starts to do schedule ...
2023-12-21 23:22:50.426 [job-0] INFO  JobContainer - Scheduler starts [1] taskGroups.
2023-12-21 23:22:50.429 [job-0] INFO  JobContainer - Running by standalone Mode.
2023-12-21 23:22:50.438 [taskGroup-0] INFO  TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-12-21 23:22:50.442 [taskGroup-0] INFO  Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-12-21 23:22:50.442 [taskGroup-0] INFO  Channel - Channel set record_speed_limit to -1, No tps activated.
2023-12-21 23:22:50.468 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-12-21 23:22:50.789 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[335]ms
2023-12-21 23:22:50.790 [taskGroup-0] INFO  TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-12-21 23:23:00.452 [job-0] INFO  StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.046s |  All Task WaitReaderTime 0.056s | Percentage 100.00%
2023-12-21 23:23:00.453 [job-0] INFO  AbstractScheduler - Scheduler accomplished all tasks.
2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX Writer.Job [streamwriter] do post work.
2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX Reader.Job [streamreader] do post work.
2023-12-21 23:23:00.453 [job-0] INFO  JobContainer - DataX jobId [0] completed successfully.
2023-12-21 23:23:00.454 [job-0] INFO  HookInvoker - No hook invoked, because base dir not exists or is a file: /home/admin/datax/hook

2023-12-21 23:23:00.455 [job-0] INFO  JobContainer -

[total cpu info] =>
	averageCpu                     | maxDeltaCpu                    | minDeltaCpu
	-1.00%                         | -1.00%                         | -1.00%

[total gc info] =>
	NAME                 | totalGCCount       | maxDeltaGCCount    | minDeltaGCCount    | totalGCTime        | maxDeltaGCTime     | minDeltaGCTime
	PS MarkSweep         | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s
	PS Scavenge          | 0                  | 0                  | 0                  | 0.000s             | 0.000s             | 0.000s

2023-12-21 23:23:00.455 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-12-21 23:23:00.456 [job-0] INFO  StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.046s |  All Task WaitReaderTime 0.056s | Percentage 100.00%
2023-12-21 23:23:00.457 [job-0] INFO  JobContainer -

Job start time                  : 2023-12-21 23:22:50
Job end time                    : 2023-12-21 23:23:00
Total elapsed time              :                 10s
Average throughput              :          253.91KB/s
Record write speed              :          10000rec/s
Total records read              :              100000
Total read/write failures       :                   0
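The summary figures are easy to cross-check: 100000 records and 2600000 bytes over 10 seconds reproduce the rates DataX reports:

```python
# Cross-check the self-test summary: 100000 records / 2600000 bytes in 10s.
records, total_bytes, seconds = 100000, 2600000, 10

rec_per_s = records // seconds
kb_per_s = total_bytes / seconds / 1024  # DataX reports KB/s

print(rec_per_s)           # 10000
print(round(kb_per_s, 2))  # 253.91
```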

5. Create the DataX job JSON (mysql2ob.json)

{
	"job": {
		"entry": {
			"jvm": "-Xms1024m -Xmx1024m"
		},
		"setting": {
			"speed": {
				"channel": 4
			},
			"errorLimit": {
				"record": 0,
				"percentage": 0.1
			}
		},
		"content": [{
			"reader": {
				"name": "mysqlreader",
				"parameter": {
					"username": "root",
					"password": "oracle123",
					"column": ["*"],
					"connection": [{
						"table": ["Tab_A"],
						"jdbcUrl": ["jdbc:mysql://192.168.150.162:4000/test?useUnicode=true&characterEncoding=utf8&useSSL=false"]
					}]
				}
			},
			"writer": {
				"name": "oceanbasev10writer",
				"parameter": {
					"obWriteMode": "insert",
					"column": ["*"],
					"preSql": ["truncate table Tab_A"],
					"connection": [{
						"jdbcUrl": "||_dsc_ob10_dsc_||obdemo:obmysql||_dsc_ob10_dsc_||jdbc:oceanbase://192.168.150.116:2883/test?useLocalSessionState=true&allowBatch=true&allowMultiQueries=true&rewriteBatchedStatements=true",
						"table": ["Tab_A"]
					}],
					"username": "root",
					"password": "oracle123",
					"writerThreadCount": 10,
					"batchSize": 1000,
					"memstoreThreshold": "0.9"
				}
			}
		}]
	}
}
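The writer's jdbcUrl uses oceanbasev10writer's delimited format: a `||_dsc_ob10_dsc_||` separator wraps a `cluster:tenant` pair in front of a plain JDBC URL. A small helper (hypothetical, for illustration only) that assembles such a string:

```python
# Assemble the oceanbasev10writer "ob10" jdbcUrl:
# ||_dsc_ob10_dsc_||<cluster>:<tenant>||_dsc_ob10_dsc_||<plain JDBC URL>
SEP = "||_dsc_ob10_dsc_||"

def ob10_jdbc_url(cluster, tenant, host, port, db, params=""):
    plain = f"jdbc:oceanbase://{host}:{port}/{db}"
    if params:
        plain += "?" + params
    return f"{SEP}{cluster}:{tenant}{SEP}{plain}"

url = ob10_jdbc_url("obdemo", "obmysql", "192.168.150.116", 2883, "test",
                    "useLocalSessionState=true&allowBatch=true")
print(url)
```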

6. Run the offline data sync

Source data:

MySQL [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.26 sec)

Target data:

obclient [test]> select * from Tab_A;

Empty set (0.133 sec)

Run the sync:

python ./datax.py ../job/mysql2ob.json

2023-12-22 00:42:13.745 [job-0] INFO  JobContainer - PerfTrace not enable!
2023-12-22 00:42:13.745 [job-0] INFO  StandAloneJobContainerCommunicator - Total 8 records, 134 bytes | Speed 13B/s, 0 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.020s |  All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-12-22 00:42:13.747 [job-0] INFO  JobContainer -

Job start time                  : 2023-12-22 00:41:54
Job end time                    : 2023-12-22 00:42:13
Total elapsed time              :                 19s
Average throughput              :               13B/s
Record write speed              :              0rec/s
Total records read              :                   8
Total read/write failures       :                   0

7. Check the data:

obclient [test]> select * from Tab_A;
+----+------+------+------+------+------+--------+
| id | bid  | cid  | name | type | num  | amt    |
+----+------+------+------+------+------+--------+
|  1 |    1 |    1 | A01  | 01   |  111 | 111.00 |
|  2 |    2 |    2 | A01  | 01   |  112 | 111.00 |
|  3 |    3 |    3 | A02  | 02   |  113 | 111.00 |
|  4 |    4 |    4 | A02  | 02   |  112 | 111.00 |
|  5 |    5 |    5 | A01  | 01   |  111 | 111.00 |
|  6 |    6 |    6 | A02  | 02   |  113 | 111.00 |
|  7 |    5 |    7 | A01  | 01   |  111 |  88.00 |
|  8 |    6 |    8 | A02  | 02   |  113 |  88.00 |
+----+------+------+------+------+------+--------+
8 rows in set (0.002 sec)
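Beyond eyeballing one table, a migration check usually compares per-table row counts between source and target. A pure-function sketch over counts you would collect with SELECT COUNT(*) on each side (connection handling omitted; the sample numbers below are illustrative):

```python
# Compare per-table row counts collected from the source and target
# (e.g. via SELECT COUNT(*) FROM <table> on each side).
def count_mismatches(src_counts, dst_counts):
    """Return {table: (src, dst)} for every table whose counts differ."""
    tables = set(src_counts) | set(dst_counts)
    return {t: (src_counts.get(t), dst_counts.get(t))
            for t in tables if src_counts.get(t) != dst_counts.get(t)}

src = {"Tab_A": 8, "t1": 100}
dst = {"Tab_A": 8, "t1": 100}
print(count_mismatches(src, dst))  # {} -> counts match
```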

