Python SQLite3でデータベースを扱う

PythonでSQLite3(エスキューライト)でデータベースを扱う方法を紹介します。SQLite3は軽量でコンパクトなファイルベースのリレーショナルデータベースシステムです。

はじめに
SQLiteのGUIの管理ツール
SQLite3の操作の基本
データベースを作成
1. データベースの作成
2. 利用できるフィールドのデータ型
レコードを登録（挿入）
1. 追加したレコードのIDを取得
レコードを更新
レコードの検索
検索結果の件数を取得
レコードを削除
テーブルを削除
Pandasデータフレーム
1. Pandasデータフレームからテーブルを作成
2. SQLiteのデータをPandasデータフレームで取得

はじめに

SQLite3のインストール

Macで初めからインストールされています。

ターミナスでインストールされているか確認しましょう。
以下のコマンドでバーションを確認することでインストールされているかわかります。

sqlite3 -version

3.28.0 2019-04-15 14:49:49

Windowsの場合は以下よりダウンロードしてインストールしてください。

SQLite Home Page

「Precompiled Binaries for Windows」の「64-bit DLL (x64) for SQLite」をダウンロードして解凍します。

Pythonのパッケージのインストール

SQLite3は標準パッケージですので、インストールは不要です。

sqlite3をインポート

import sqlite3

本ページではインポートされている前提で紹介します。

参考サイト

SQLite 公式サイト

SQLite Home Page

SQLite Tutorial

SQLite Tutorial - An Easy Way to Master SQLite Fast

This SQLite tutorial teaches you everything you need to know to start using SQLite effectively. You will learn SQLite via extensive hands-on practices.

Python sqlite3

sqlite3 --- SQLite データベース用の DB-API 2.0 インターフェース

ソースコード: Lib/sqlite3/ SQLite は、軽量なディスク上のデータベースを提供する C ライブラリです。別のサーバプロセスを用意する必要なく、 SQL クエリー言語の非標準的な一種を使用してデータベースにアクセスできます。一部のアプリケーションは内部データ保存に SQLite を使えます。また、SQL...

SQLiteのGUIの管理ツール

定番の管理ツールとして、DB Browser for SQLiteがあります。

Windowsだけでなく、MacやLinuxなどでも利用できます。

公式サイト

DB Browser for SQLite

SQLite3の操作の基本

データベースに接続

connection = sqlite3.connect(dbpath)

dbpathは任意のデータベースのファイル名とそのパスを設定してください。
ファイルが存在しない場合は空のデータベースのファイルを作成します。

Cursorオブジェクトを作成

cursor = connection.cursor()

Cursorオブジェクトは主に以下のような機能があります。

execute：SQL 文を実行します。
fetchone：クエリ結果から1つのシーケンスを返します。
fetchall：全てのクエリ結果をリストで返します。
close：接続を閉じます。

トランザクションをコミット

connection.commit()

SQLite3ではコミットしないとデータが反映されません。
（コミットしなくても反映させる方法はあるのですが、コミットを利用するようにしたほうが良いでしょう）

データベースを切断

connection.close()

データベースを作成

データベースの作成

import sqlite3

dbpath = "./log.sqlite"

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = """
    CREATE TABLE IF NOT EXISTS Log (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        dates TIMESTAMP,
        messages TEXT
    )
"""

cursor.execute(sql)
connection.close()

CREATE TABLE IF NOT EXISTS テーブル名(フィールド名データ型, )

IF NOT EXISTS：テーブルが存在しない場合は作成させます。これは無くてもいいです。
変数にSQL文を代入しているのですが、長くなるのでダブルクォーテーションを3つ（"""）で囲むと改行が自由になります。

利用できるフィールドのデータ型

INTEGER：整数。値の大きさに応じて、1、2、3、4、6、または8バイトに格納されます。
REAL：浮動小数点。8バイトのIEEE浮動小数点数として格納されます。
TEXT：テキスト。文字コードの対応はUTF-8、UTF-16BE、UTF-16LEのみ
BLOB：なんでも。入力されたとおりに格納されます。

補足

Boolean型はない。
DateやTime型はない。
PythonのAPIによってTIMESTAMP型の指定は可能です。しかしTIMESTAMPとして扱われるのではなく、文字列として保存しています。そのためTIMESTAMPとして指定しても自動的に挿入されることはありません。空の場合はNULLとなります。

レコードを登録（挿入）

from datetime import datetime

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

now = datetime.now() 
messages = "TEST"

sql = "INSERT INTO Log (dates,messages) VALUES (?, ?)"
cursor.execute(sql, (now, messages))

connection.commit()
connection.close()

sql = "INSERT INTO テーブル名 (フィールド名1,フィールド名2) VALUES (?, ?)"

sqlにSQL文を代入します。
値は?とします。

cursor.execute(sql, (値1, 値2))

cursorのexecuteでSQL文を実行します。
値はタプル型でまとめます。

登録するフィールドが一つの場合

sql = "INSERT INTO テーブル名 (フィールド名) VALUES (?)"

cursor.execute(sql, (値, ))

(messages, )のように、終わりにもカンマが必要です。カンマがないとエラーになります。

追加したレコードのIDを取得

cursor.lastrowidにて取得することができます。

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

now = datetime.now() 
messages = "TEST"

sql = "INSERT INTO Log (dates,messages) VALUES (?, ?)"
cursor.execute(sql, (now, messages))

lastrowid = cursor.lastrowid
connection.commit()

sql = "SELECT * from Log where id = {}".format(lastrowid)
cursor.execute(sql)
data = cursor.fetchone()

connection.close()

print(data)
# (6, '2020-10-10 13:38:48.551200', 'TEST')

レコードを更新

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

id = 1
messages = "TEST2"

sql = "UPDATE Log SET messages = ? where id = ?"
cursor.execute(sql, (messages, id))

connection.commit()
connection.close()

sql = "UPDATE Log SET フィールド名 = ? where キーのフィールド名 = ?"

sqlにSQL文を代入します。
whereで条件を決めます。
キーと値は?とします。

cursor.execute(sql, (messages, id))

cursorのexecuteでSQL文を実行します。

レコードの検索

単一のフィールドを1件

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT id from Log"
cursor.execute(sql)
data = cursor.fetchone()

connection.commit()
connection.close()

print(data)
# (1,)

cursor.fetchone()

検索結果から初めの1つのシーケンスを返します。

全てのフィールドを全件取得

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT * from Log"
cursor.execute(sql)

data = cursor.fetchall()

connection.commit()
connection.close()

print(data)
# [(1, '2020-07-08 17:57:33.304128', 'TEST2'), (2, None, 'TEST')]

cursor.fetchall()

全ての検索結果をリストで返します。

For文で値を取得

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT * from Log"
cursor.execute(sql)
for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (1, '2020-07-08 16:37:25.783336', 'TEST2')
# (2, '2020-07-08 16:37:34.170706', 'TEST')

単一の条件で全件を取得

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

id = 2
sql = "SELECT * from Log where id = ?"
cursor.execute(sql, (id,))
for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (2, None, 'TEST')

複数の条件で全件を取得

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

id = 2
messages = "TEST"

sql = "SELECT * from Log where id = ? and messages = ?"
cursor.execute(sql, (id,messages))
for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (2, None, 'TEST')

合計値を取得

sum(フィールド名)で取得することができます。

where文で条件を付けた場合など、検索対象がない場合の返り値は以下のようになります。
0でもなく、Noneでもなく、(None,)となりますので、注意してください。

(None,)

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()
sql = "SELECT sum(id) from Log"
cursor.execute(sql)
data = cursor.fetchone()
connection.close()
print(data)
# (21,)

最大値・最小値を検索（取得）

max(フィールド名), min(フィールド名)で取得することができます。

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT max(dates), min(dates) from Log"
cursor.execute(sql)
data = cursor.fetchone()

connection.close()

print(data)
# ('2020-10-10 13:38:48.551200', '2020-07-08 17:57:33.304128')

最大値・最小値のレコードを検索

最大値・最小値の取得の応用になりますが、whereに最大値・最小値を取得するSQL文を設定します。

以下は最大値のサンプルですが、最小値にする場合は、minとしてください。

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT *from Log where dates = (SELECT max(dates) from Log)"
cursor.execute(sql)
data = cursor.fetchone()

connection.close()

print(data)
# (6, '2020-10-10 13:38:48.551200', 'TEST')

LIKE（ワイルドカード）で検索

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

messages = "TE_T%"
sql = "SELECT * from Log where messages LIKE ?"
cursor.execute(sql, (messages,))

for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (1, '2020-07-08 17:57:33.304128', 'TEST2')
# (2, None, 'TEST')

LIKEを使用します。
%で前方・後方を任意に
_で文字列の一文字を任意にできます。複数文字の場合はその分の_を挿入します。

NULLを検索

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

sql = "SELECT * from Log where dates is NULL"
cursor.execute(sql)

for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (2, None, 'TEST')

フィールド名 is NULLとします。

日付の検索

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

dates = "2020-07-08"

sql = "SELECT * from Log where dates >= ?"
cursor.execute(sql, (dates, ))
for data in cursor.fetchall():
    print(data)

connection.commit()
connection.close()

# (1, '2020-07-08 17:57:33.304128', 'TEST2')

SQLite3では厳密には日付や日時型のデータ型は存在しませんが、擬似的に検索することができます。

検索結果のレコードを連想配列で取得

connection = sqlite3.connect(dbpath)

connection.row_factory = sqlite3.Row

cursor = connection.cursor()

sql = "SELECT * from Log"
cursor.execute(sql)

for data in cursor.fetchall():
    print(data["id"], data["dates"], data["messages"])

connection.commit()
connection.close()

# 1 2020-07-08 17:57:33.304128 TEST2
# 2 None TEST

connection.row_factory = sqlite3.Row

この一文を入れることにより、連想配列で取得することができます。
それぞれの意味を理解するのは難しいので、ここでは説明しません。気になる方は以下を参照してください。
- connection.row_factory
- sqlite3.Row
data = cursor.fetchall()としてprint(data)すると、以下のように表示されるので注意してください。
[<sqlite3.Row object at 0x110c30bf0>, <sqlite3.Row object at 0x110c30e50>]

検索結果の件数を取得

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

id = 1
dates = "2020-07-08"

sql = "SELECT * from Log where id = ? and dates >= ?"
cursor.execute(sql, (id, dates))

data = cursor.fetchall()

connection.commit()
connection.close()

print(data)
# [(1, '2020-07-08 17:57:33.304128', 'TEST2')]

print(len(num))
# 2

SQLite3では件数を取得する関数が用意されていないため、fetchall() で全件取得してリスト型の件数を取得する必要があります。

レコードを削除

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

id = 2

sql = "DELETE from Log where id = ?"
cursor.execute(sql, (id, ))

connection.commit()
connection.close()

テーブルを削除

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

cursor.execute("DROP TABLE IF EXISTS Log")

connection.close()

Pandasデータフレーム

Pandasデータフレームからテーブルを作成

df.to_sql('テーブル名', connection, if_exists='replace')

if_exists='replace'：もし同一テーブル名が存在していたら置き換える。
データベースのインデックスを指定する場合は、予めPandasデータフレームでインデックスを指定すると引き継がれます。

import sqlite3
import pandas as pd

dbpath = "./log.sqlite"

connection = sqlite3.connect(dbpath)
cursor = connection.cursor()

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns = ["aaa", "bbb", "ccc"])
print(df)
#    aaa  bbb  ccc
# 0    1    2    3
# 1    4    5    6
# 2    7    8    9

df.to_sql('test', connection, if_exists='replace')

sql = 'SELECT * FROM test'
for row in cursor.execute(sql):
    print(row)

# (0, 1, 2, 3)
# (1, 4, 5, 6)
# (2, 7, 8, 9)

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns = ["aaa", "bbb", "ccc"])
df = df.set_index('aaa')
print(df)
#      bbb  ccc
# aaa          
# 1      2    3
# 4      5    6
# 7      8    9

df.to_sql('test', connection, if_exists='replace')

sql = 'SELECT * FROM test'
for row in cursor.execute(sql):
    print(row)
    
# (1, 2, 3)
# (4, 5, 6)
# (7, 8, 9)

cursor.close()
connection.close()

SQLiteのデータをPandasデータフレームで取得

pd.read_sql(sql, connection)

インデックスの指定：index_col='index'

import sqlite3
import pandas as pd

dbpath = "./log.sqlite"

connection = sqlite3.connect(dbpath)

sql = 'SELECT * FROM test'

df  = pd.read_sql(sql, connection)
print(df)
#    index  aaa  bbb  ccc
# 0      0    1    2    3
# 1      1    4    5    6
# 2      2    7    8    9

df  = pd.read_sql(sql, connection, index_col='index')
print(df)
#        aaa  bbb  ccc
# index               
# 0        1    2    3
# 1        4    5    6
# 2        7    8    9

connection.close()