How to create tonnes of dummy data in MySQL

When we are testing various database concepts, we need mock data. Sometimes, we need a lot of mock data.

There are two really good methods to generate mock data in your MySQL tables. The first is in Python, using the Faker library to generate data that is relatively representative of the data you might see in a live database. An example is below. I have discussed Faker previously, here; this just expands on that by connecting to the MySQL database too.

from random import randint

from faker import Faker
from sqlalchemy import create_engine, text

fake = Faker('en-GB')

#####SOURCE
try:
    # Create the engine once, outside the loop - reconnecting per row is very slow
    engine = create_engine("mysql+pymysql://username:password@localhost/ran", echo=True)
except Exception as e:
    print('connect: ' + str(e))
    raise

# Parameterised INSERT: no string concatenation, no quoting bugs when
# Faker generates values containing apostrophes
insert = text(
    "INSERT INTO ran.dbmigrate VALUES "
    "(:i, :customer_name, :address, :place, :favorite_quote, "
    ":favorite_meal, :field, :field2, :field3)"
)

for i in range(1, 1000000001):
    row = {
        'i': i,
        'customer_name': fake.name(),
        'address': fake.bs(),
        'place': fake.city(),
        'favorite_quote': fake.catch_phrase(),
        'favorite_meal': fake.catch_phrase(),
        'field': fake.catch_phrase(),
        'field2': fake.catch_phrase(),
        'field3': randint(1, 30),
    }
    try:
        with engine.begin() as conn:
            conn.execute(insert, row)
    except Exception as e:
        print('insert: ' + str(e))
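One row per INSERT gets slow at this kind of scale; batching many rows into a single statement is much faster. Here is a minimal sketch of the batching idea, using random stand-ins for the Faker values (in practice you would call fake.name(), fake.city(), and so on) and an in-memory SQLite database so it runs standalone — swap the connection for your MySQL one in a real run:

```python
import random
import sqlite3
import string

def fake_row(i):
    # Stand-ins for Faker values so this sketch needs only the standard library
    word = lambda: ''.join(random.choices(string.ascii_lowercase, k=8))
    return (i, word(), word(), random.randint(1, 30))

# In-memory SQLite so the sketch runs standalone; use your MySQL
# connection here in practice
conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE dbmigrate (i INT, customer_name TEXT, place TEXT, field3 INT)"
)

BATCH = 1000
for start in range(0, 10000, BATCH):
    rows = [fake_row(i) for i in range(start, start + BATCH)]
    # executemany sends the whole batch in one call instead of one INSERT per row
    conn.executemany("INSERT INTO dbmigrate VALUES (?, ?, ?, ?)", rows)
    conn.commit()

print(conn.execute("SELECT COUNT(*) FROM dbmigrate").fetchone()[0])  # prints 10000
```

The same pattern works with MySQL drivers too: build a list of row tuples, then hand the whole batch to executemany.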

The other option is to create a stored procedure. I’ve discussed them here. In the example below, we create a table called mybigtablex with a number of fields.

Within the procedure, we run a while loop – while i < 1000 – and insert a row of data on each iteration. This is less flexible, as you can’t generate really representative data like you can with Faker, but it’s a great option when you just need lots of rows!

DROP PROCEDURE IF EXISTS make_lots_data;

CREATE TABLE `mybigtablex` 
(
  `id`                bigint(20) NOT NULL      AUTO_INCREMENT,
  `order_date`        timestamp  NULL          DEFAULT CURRENT_TIMESTAMP,
  `sales_category`    int(11)                  DEFAULT NULL,
  `store_id`          varchar(40)              DEFAULT NULL,
  `order_value`       float                    DEFAULT NULL,

  PRIMARY KEY (`id`)
);


DELIMITER $$
CREATE PROCEDURE make_lots_data()
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < 1000 DO
    INSERT INTO `mybigtablex` (`order_date`, `sales_category`, `store_id`, `order_value`) VALUES (
      FROM_UNIXTIME(UNIX_TIMESTAMP('2022-01-01 01:00:00') + FLOOR(RAND()*31536000)),
      ROUND(RAND()*10, 0),
      1,
      ROUND(RAND()*100, 2)
    );
    SET i = i + 1;
  END WHILE;
END$$
DELIMITER ;

CALL make_lots_data();
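The FROM_UNIXTIME expression in the procedure picks a random order_date by adding up to a year’s worth of seconds (31536000) to a fixed starting timestamp. The same trick sketched in Python, as a quick sanity check of what the procedure generates:

```python
import random
from datetime import datetime, timezone

# Seconds in a (non-leap) year, matching the 31536000 in the procedure
YEAR_SECONDS = 31536000

# Same starting point as the procedure's '2022-01-01 01:00:00'
start = datetime(2022, 1, 1, 1, 0, 0, tzinfo=timezone.utc).timestamp()

def random_order_date():
    # Uniformly random moment between the start and one year later
    return datetime.fromtimestamp(start + random.randrange(YEAR_SECONDS),
                                  tz=timezone.utc)

for _ in range(3):
    print(random_order_date())
```

Every value lands in the year following the start timestamp, which is exactly the spread of order_date values the procedure writes into mybigtablex.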

Either option will give you lots of data to play with to test your MySQL functions.
