
Quick note: converting a NumPy ndarray binary file to a JSON file


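I had to process a NumPy binary file at short notice. Inspecting it showed that each element is a dict, so here is a quick note on the steps. If you are not familiar with the basics of NumPy and Python, see my earlier articles.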

In [1]:
%%time

import numpy as np
Wall time: 135 ms
In [2]:
%%time

import pandas as pd
Wall time: 351 ms
In [3]:
%%time

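# Build a DataFrame from the ndarray. The array holds dict objects, so
# NumPy 1.16.3 and later need allow_pickle=True to load it.
df = pd.DataFrame(np.load("data.npy", allow_pickle=True))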
Wall time: 910 ms
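
For readers without the original file, here is a minimal sketch of the same load step on a tiny stand-in array. The sample records, the file name `sample.npy`, and the helper `load_object_array` are all hypothetical and are not from the original notebook. An object array of dicts is stored with pickle, so it should only be loaded from a file you trust.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical sample records standing in for the contents of data.npy.
records = [
    {"email": "a@example.com", "pwd": "0" * 32},
    {"email": "b@example.com", "pwd": "1" * 32},
]


def load_object_array(path):
    """Load a pickled object array; only use this on files you trust."""
    return np.load(path, allow_pickle=True)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.npy")
    # Saving a list of dicts as dtype=object yields a 1-D object array.
    np.save(path, np.array(records, dtype=object))
    arr = load_object_array(path)

# A 1-D object array becomes a single column (named 0) of dicts.
sample_df = pd.DataFrame(arr)
print(sample_df.shape)
```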
In [4]:
%%time

df.head(10) # quick preview of the first 10 rows
Wall time: 1 ms
Out[4]:
0
0 {'email': '[email protected]', 'pwd': '9755DD0556...
1 {'email': '[email protected]', 'pwd': '6BB518D1A42...
2 {'email': '[email protected]', 'pwd': '0079ABBA6...
3 {'email': '[email protected]', 'pwd': 'E23E561F02...
4 {'email': '[email protected]', 'pwd': ...
5 {'email': '[email protected]', 'pwd': '9B084...
6 {'email': '[email protected]', 'pwd': '7D07...
7 {'email': '[email protected]', 'pwd': '448A2...
8 {'email': '[email protected]', 'pwd': 'DBF...
9 {'email': '[email protected]', 'pwd': '22DDD26D...
In [5]:
%%time

# Extract the email column
df['Email'] = df[0].map(lambda x: dict(x)["email"])
# Extract the pwd column
df['MD5'] = df[0].map(lambda x: dict(x)["pwd"])
# Drop the original column, which is no longer needed
del df[0]
Wall time: 1.05 s
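
As an alternative to mapping each key separately, pandas can expand a list of dicts into columns in one step. This is a sketch under assumptions, not the method used in this note: the sample array and the name `flat` are hypothetical, and it relies on every record having the same `email` and `pwd` keys.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the loaded object array of dicts.
arr = np.array(
    [
        {"email": "a@example.com", "pwd": "0" * 32},
        {"email": "b@example.com", "pwd": "1" * 32},
    ],
    dtype=object,
)

# A list of dicts becomes one column per key; then rename the columns
# to match the names used in this note.
flat = pd.DataFrame(arr.tolist()).rename(columns={"email": "Email", "pwd": "MD5"})
print(flat)
```

This avoids the temporary column 0 and the two separate `map` passes, at the cost of materializing the array as a Python list first.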
In [6]:
%%time

df.size # total number of elements (rows × columns)
Wall time: 0 ns
Out[6]:
2097148
In [7]:
%%time

df.shape
Wall time: 0 ns
Out[7]:
(1048574, 2)
In [8]:
%%time

df.head(10)
Wall time: 0 ns
Out[8]:
Email MD5
0 [email protected] 9755DD05564EAD9EADCACE40B5A02711
1 [email protected] 6BB518D1A42F22DA5CA62D5EE41C5D4F
2 [email protected] 0079ABBA66856DAFDF2B9A6E0DB23A09
3 [email protected] E23E561F0202ACECA30B8F07A48AB8E9
4 [email protected] 0EB1A2DB91A2BF3FB6275DE659A25805
5 [email protected] 9B08473C992C07E98389ED1C280A634A
6 [email protected] 7D0710824FF191F6A0086A7E3891641E
7 [email protected] 448A2BCEE09A3B14C22DC000351216B7
8 [email protected] DBFBA02E366BAB58DF605D6475189A51
9 [email protected] 22DDD26D62AF8B1C4A216BE18FDFF5B2
In [9]:
%%time

df.T.to_json("user.json") # save as JSON (the transpose is only there to get the familiar one-object-per-row layout)
Wall time: 2.85 s
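
To make the effect of the transpose concrete, the sketch below compares it with the `orient` parameter of `DataFrame.to_json` on a hypothetical two-row frame (the sample values and variable names are illustrative only). With no path argument, `to_json` returns the JSON as a string.

```python
import json

import pandas as pd

# Hypothetical two-row frame with the same columns as in this note.
small = pd.DataFrame(
    {
        "Email": ["a@example.com", "b@example.com"],
        "MD5": ["0" * 32, "1" * 32],
    }
)

# Transposing and using the default orient ("columns") keys the output
# by row label, which is the layout orient="index" produces directly.
via_transpose = json.loads(small.T.to_json())
via_index = json.loads(small.to_json(orient="index"))

# orient="records" drops the row labels and yields a plain list of objects.
as_records = json.loads(small.to_json(orient="records"))

print(via_transpose == via_index)
print(as_records[0])
```

If the row labels are not needed downstream, `orient="records"` may be the more convenient layout; which one fits depends on how the JSON will be consumed.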
