Insert rows based on a value and update other columns?

Posted 2024-10-02 14:16:53


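I am new to the pandas module and use it at work for data analysis. I have an Excel sheet that imports data from an Access database every day; a new record is inserted each time a machine goes down. The table essentially shows the uptime percentage of each machine: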

ID | Area | Machine | Week | UTPercent
--------------------------------------
1  |  A1  |   M1    |   1  |  80
2  |  A1  |   M1    |   4  |  90
3  |  A2  |   M2    |   4  |  70
4  |  A2  |   M2    |   8  |  82

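In the table above, if the current week is 8, then weeks 2, 3, 5, 6, 7, and 8 were skipped for machine M1, and weeks 1, 2, 3, 5, 6, and 7 for machine M2. How can I add the missing rows in between and set UTPercent to 100 for all of them? In other words, this is what I need: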

ID  | Area | Machine | Week | UTPercent
--------------------------------------
1   |  A1  |   M1    |   1  |  80
2   |  A1  |   M1    |   2  |  100
3   |  A1  |   M1    |   3  |  100
4   |  A1  |   M1    |   4  |  90
5   |  A1  |   M1    |   5  |  100
6   |  A1  |   M1    |   6  |  100
7   |  A1  |   M1    |   7  |  100
8   |  A1  |   M1    |   8  |  100
9   |  A2  |   M2    |   1  |  100
10  |  A2  |   M2    |   2  |  100
11  |  A2  |   M2    |   3  |  100
12  |  A2  |   M2    |   4  |  70
13  |  A2  |   M2    |   5  |  100
14  |  A2  |   M2    |   6  |  100
15  |  A2  |   M2    |   7  |  100
16  |  A2  |   M2    |   8  |  82

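Also, when plotting a bar chart for machine M1 in area A1 only, how do I add data labels? I made a bar chart of week (x-axis) against uptime percentage (y-axis), and I need the week numbers as my data labels.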

Here is what I have done so far:

import matplotlib.pyplot as plt
import pandas as pd

# read_excel takes sheet_name (not sheetname) and has no sep argument
df = pd.read_excel("targetFolder.xlsx", sheet_name=0)

area1 = df.loc[df['Area'] == 'A1']

# the data

data = list(area1['UTPercent'])
weekNum = list(area1['Week'])  # take weeks from the filtered frame so both lists have the same length

## the bars
fig = plt.figure()
ax1 = fig.add_subplot(111)
plotData = ax1.bar(weekNum, data, width = 0.45, 
color='#556B2F')

# adding labels and title
ax1.set_xlabel("Weeks")
ax1.set_ylabel("Uptime Percentage")
ax1.set_title("Metrology Area", weight='bold')

fig.tight_layout()
plt.show()

1 answer
User
#1 · Posted 2024-10-02 14:16:53

For the first question, I would do it like this (assuming your table is named uptimes):

INSERT INTO uptimes (Week, Machine, Area, UTPercent)
    (SELECT SeqValue AS Week,
            machines.Machine,
            machines.Area,
            100 AS UTPercent
     FROM
         (SELECT (TWO_1.SeqValue + TWO_2.SeqValue + TWO_4.SeqValue + TWO_8.SeqValue + TWO_16.SeqValue + TWO_32.SeqValue) SeqValue
          FROM
              (SELECT 0 SeqValue
               UNION ALL SELECT 1 SeqValue) TWO_1
          CROSS JOIN
              (SELECT 0 SeqValue
               UNION ALL SELECT 2 SeqValue) TWO_2
          CROSS JOIN
              (SELECT 0 SeqValue
               UNION ALL SELECT 4 SeqValue) TWO_4
          CROSS JOIN
              (SELECT 0 SeqValue
               UNION ALL SELECT 8 SeqValue) TWO_8
          CROSS JOIN
              (SELECT 0 SeqValue
               UNION ALL SELECT 16 SeqValue) TWO_16
          CROSS JOIN
              (SELECT 0 SeqValue
               UNION ALL SELECT 32 SeqValue) TWO_32
          HAVING SeqValue <=
              (SELECT max(week)
               FROM uptimes)
          AND SeqValue > 0) AS integers
     LEFT JOIN
         (SELECT Machine,
                 Area
          FROM uptimes
          GROUP BY 1,
                   2) AS machines ON 1=1
     LEFT JOIN uptimes ON uptimes.week = integers.SeqValue
     AND machines.Machine = uptimes.Machine
     WHERE uptimes.week IS NULL);

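How it works:

  1. Generate the integers from 1 up to the highest week in the table (the SELECT ... UNION ALL subqueries)
  2. Get all machines and areas from the table (SELECT Machine, Area ...)
  3. Cross join the two to get every possible combination (JOIN ... ON 1=1)
  4. Filter out the combinations that already exist (WHERE uptimes.week IS NULL)
  5. Insert the result into the table (INSERT INTO)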
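Since the question asks about pandas, the same five steps can also be sketched directly on a DataFrame. This is only a rough equivalent, not part of the SQL above: the sample rows are copied from the question, and `current_week`, `machines`, `weeks`, `grid`, and `filled` are names made up for illustration.

```python
import pandas as pd

# Sample data copied from the question's first table.
df = pd.DataFrame({
    "Area": ["A1", "A1", "A2", "A2"],
    "Machine": ["M1", "M1", "M2", "M2"],
    "Week": [1, 4, 4, 8],
    "UTPercent": [80, 90, 70, 82],
})

current_week = 8  # assumption: could also be df["Week"].max()

# Steps 1-3: cross every (Area, Machine) pair with every week 1..current_week.
machines = df[["Area", "Machine"]].drop_duplicates()
weeks = pd.DataFrame({"Week": range(1, current_week + 1)})
grid = (
    machines.assign(_key=1)
    .merge(weeks.assign(_key=1), on="_key")
    .drop(columns="_key")
)

# Steps 4-5: a left join keeps the recorded rows; weeks with no record
# come back as NaN and are filled with 100.
filled = grid.merge(df, on=["Area", "Machine", "Week"], how="left")
filled["UTPercent"] = filled["UTPercent"].fillna(100).astype(int)
filled = filled.sort_values(["Machine", "Week"]).reset_index(drop=True)

# Renumber the IDs, as in the desired output table.
filled.insert(0, "ID", range(1, len(filled) + 1))

print(filled)
```

The temporary `_key` column is just one way to build a cross join; it is used here because it does not depend on a recent pandas version.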

For the other question, try using the pandas plot function.

df = pd.read_excel("targetFolder.xlsx", sheet_name=0)
area1 = df[df.Area == 'A1']
area1.set_index('Week')['UTPercent'].plot(kind='bar')
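The plot call above draws the bars but does not by itself add data labels. One possible way to write the week number above each bar is to annotate the bars one by one, as in the sketch below. It assumes the labels wanted are the week numbers; the sample rows are the A1 rows from the question, and the `labels` list exists only so the result can be inspected.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows for area A1, copied from the question's first table.
area1 = pd.DataFrame({"Week": [1, 4], "UTPercent": [80, 90]})

fig, ax = plt.subplots()
bars = ax.bar(area1["Week"], area1["UTPercent"], width=0.45, color="#556B2F")

# Put each bar's week number just above the top of the bar.
labels = []
for bar, week in zip(bars, area1["Week"]):
    text = ax.annotate(
        f"Week {week}",
        xy=(bar.get_x() + bar.get_width() / 2, bar.get_height()),
        xytext=(0, 3),              # shift the label 3 points upward
        textcoords="offset points",
        ha="center",
        va="bottom",
    )
    labels.append(text.get_text())

ax.set_xlabel("Weeks")
ax.set_ylabel("Uptime Percentage")
fig.tight_layout()
# Call plt.show() or fig.savefig(...) here to see the chart.
```

To label the bars with the uptime value instead, replace the f-string with the bar height, `bar.get_height()`.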
