Tag archive: data-conversion

Why does ~True result in -2?

Question: Why does ~True result in -2?

In Python console:

~True

Gives me:

-2

Why? Can someone explain this particular case to me in binary?


Answer 0

int(True) is 1.

1 is:

00000001

and ~1 is:

11111110

Which is -2 in two’s complement. [1]

[1] Flip all the bits, add 1 to the resulting number, interpret the result as the binary representation of the magnitude, and add a negative sign (since the number begins with 1):

11111110 → 00000001 → 00000010 
         ↑          ↑ 
       Flip       Add 1

Which is 2, but the sign is negative since the MSB is 1.
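
The flip-and-add-one recipe above can be checked directly in Python. This is only a minimal sketch: Python integers have arbitrary precision, so the 8-bit width (the 0xFF mask) and the helper name to_8bit are assumptions made here to match the 8-bit picture used in this answer.

```python
def to_8bit(value):
    """Return the 8-bit two's-complement bit pattern of value as a string."""
    return format(value & 0xFF, '08b')

print(to_8bit(1))    # 00000001
print(to_8bit(~1))   # 11111110
print(to_8bit(-2))   # 11111110

# Decode 11111110 by hand: flip the bits, add 1, attach a minus sign.
pattern = 0b11111110
magnitude = ((pattern ^ 0xFF) + 1) & 0xFF
print(-magnitude)    # -2

# For any Python int, ~x equals -x - 1.
print(all(~x == -x - 1 for x in range(-5, 6)))  # True
```

The last line is the width-independent way to state the result: ~1 is -1 - 1, which is -2.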


Worth mentioning:

Think about bool and you’ll find that it’s numeric in nature: it has two values, True and False, and they are just “customized” versions of the integers 1 and 0 that only print themselves differently. They are instances of bool, which is a subclass of the integer type int.

So they behave exactly as 1 and 0, except that bool redefines str and repr to display them differently.

>>> type(True)
<class 'bool'>
>>> isinstance(True, int)
True

>>> True == 1
True
>>> True is 1  # they're still different objects
False
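
Because bool is a subclass of int, True and False can be used wherever an integer is expected. A short sketch of what that means in practice (the list flags is made up for illustration):

```python
flags = [True, False, True, True]

# Booleans take part in arithmetic as 1 and 0.
print(sum(flags))             # 3
print(True + True)            # 2

# They can index a sequence like 0 and 1.
print(['no', 'yes'][True])    # yes

# Only the text representation differs from plain ints.
print(str(True), str(1))      # True 1
print(int.__repr__(True))     # 1

print(issubclass(bool, int))  # True
```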

Answer 1

The Python bool type is a subclass of int (for historical reasons; booleans were only added in Python 2.3).

Since int(True) is 1, ~True is ~1 is -2.

See PEP 285 for why bool is a subclass of int.

If you wanted the boolean inverse, use not:

>>> not True
False
>>> not False
True
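
One practical consequence, sketched here with a made-up loop variable flag: since ~True is -2 and ~False is -1, both results are non-zero and therefore truthy, so ~ cannot stand in for not in a condition.

```python
for flag in (True, False):
    # not gives the logical inverse; ~ gives the bitwise inverse of 1 or 0.
    print(flag, not flag, ~flag, bool(~flag))
# True False -2 True
# False True -1 True
```

Recent Python versions (3.12 and later) additionally emit a DeprecationWarning when ~ is applied to a bool.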

If you wanted to know why ~1 is -2, it’s because you are inverting all bits in a signed integer; 00000001 becomes 11111110, which in a signed integer is a negative number, see Two’s complement:

>>> # Python 3
...
>>> import struct
>>> format(struct.pack('b', 1)[0], '08b')
'00000001'
>>> format(struct.pack('b', ~1)[0], '08b')
'11111110'

where the initial 1 bit means the value is negative, and the rest of the bits encode the inverse of the positive number minus one.


Answer 2

~True == -2 is not surprising if True means 1 and ~ means bitwise inversion

provided that


Edits:

  • fixed the mixing between integer representation and bitwise inversion operator
  • applied another polishing (the shorter the message, the more work needed)

python pandas dataframe columns convert to dict key and value

Question: python pandas dataframe columns convert to dict key and value

I have a pandas data frame with multiple columns and I would like to construct a dict from two columns: one as the dict’s keys and the other as the dict’s values. How can I do that?

Dataframe:

           area  count
co tp
DE Lake      10      7
   Forest    20      5
FR Lake      30      2
   Forest    40      3

I need to define area as key, count as value in dict. Thank you in advance.


Answer 0

If lakes is your DataFrame, you can do something like

# Use bracket indexing: lakes.count is the DataFrame.count method, not the column.
area_dict = dict(zip(lakes['area'], lakes['count']))
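
A self-contained sketch of this approach, rebuilding the question's frame under the assumption that co and tp form a two-level index. Attribute access such as lakes.count resolves to the DataFrame.count method rather than the column named count, so bracket indexing is used here.

```python
import pandas as pd

# Rebuild the example frame; the (co, tp) MultiIndex is an assumption
# based on how the table is printed in the question.
index = pd.MultiIndex.from_tuples(
    [('DE', 'Lake'), ('DE', 'Forest'), ('FR', 'Lake'), ('FR', 'Forest')],
    names=['co', 'tp'],
)
lakes = pd.DataFrame({'area': [10, 20, 30, 40], 'count': [7, 5, 2, 3]}, index=index)

# Pair the two columns element by element.
area_dict = dict(zip(lakes['area'], lakes['count']))
print(area_dict)

# Equivalent: index a Series by area and convert it.
same_dict = lakes.set_index('area')['count'].to_dict()
print(same_dict == area_dict)
```

If area contains repeated values, later rows overwrite earlier ones in either variant, since dictionary keys are unique.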

Answer 1

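With pandas, if lakes is your DataFrame, you can convert every row to a dictionary (this gives a list of per-row dicts rather than a single area-to-count mapping):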

area_dict = lakes.to_dict('records')

Answer 2

You can also do this if you want to play around with pandas. However, I like punchagan’s way.

import pandas as pd

# replicating your dataframe
lake = pd.DataFrame({'co tp': ['DE Lake', 'Forest', 'FR Lake', 'Forest'],
                     'area': [10, 20, 30, 40],
                     'count': [7, 5, 2, 3]})
lake.set_index('co tp', inplace=True)

# to get key value using pandas
area_dict = lake.set_index('area').T.to_dict('records')[0]
print(area_dict)

output: {10: 7, 20: 5, 30: 2, 40: 3}

How do I convert this list of dictionaries to a csv file?

Question: How do I convert this list of dictionaries to a csv file?

I have a list of dictionaries that looks something like this:

toCSV = [{'name':'bob','age':25,'weight':200},{'name':'jim','age':31,'weight':180}]

What should I do to convert this to a csv file that looks something like this:

name,age,weight
bob,25,200
jim,31,180

Answer 0

import csv
toCSV = [{'name':'bob','age':25,'weight':200},
         {'name':'jim','age':31,'weight':180}]
keys = toCSV[0].keys()
with open('people.csv', 'w', newline='') as output_file:
    dict_writer = csv.DictWriter(output_file, keys)
    dict_writer.writeheader()
    dict_writer.writerows(toCSV)

EDIT: My prior solution doesn’t handle the order. As noted by Wilduck, DictWriter is more appropriate here.
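
If the column order matters, it can be passed explicitly instead of being taken from the first dictionary. A minimal sketch that writes to an in-memory buffer (io.StringIO) so the result can be inspected; the explicit fieldnames list is an assumption about the desired order, not part of the original answer:

```python
import csv
import io

toCSV = [{'name': 'bob', 'age': 25, 'weight': 200},
         {'name': 'jim', 'age': 31, 'weight': 180}]

# Explicit column order; DictWriter writes the columns in this order.
fieldnames = ['name', 'age', 'weight']

buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(toCSV)

print(buffer.getvalue())
```

Swapping the buffer for a file opened with open('people.csv', 'w', newline='') gives the same content on disk.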


Answer 1

This is for when you have a single list of dictionaries:

import csv
with open('names.csv', 'w', newline='') as csvfile:
    fieldnames = ['first_name', 'last_name']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})

Answer 2

In Python 3 things are a little different, but way simpler and less error-prone. It’s a good idea to open the CSV file with utf8 encoding explicitly, as that makes the data more portable to others (assuming you aren’t using a more restrictive encoding, like latin1).

import csv
toCSV = [{'name':'bob','age':25,'weight':200},
         {'name':'jim','age':31,'weight':180}]
with open('people.csv', 'w', encoding='utf8', newline='') as output_file:
    fc = csv.DictWriter(output_file,
                        fieldnames=toCSV[0].keys())
    fc.writeheader()
    fc.writerows(toCSV)
  • Note that csv in Python 3 needs the newline='' parameter; otherwise you get blank lines in your CSV when opening it in Excel/OpenCalc.
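
The reason behind the newline='' advice can be seen in a small sketch: the csv writer already ends each row with \r\n, so a file opened in text mode on a platform that translates \n to \r\n would end up with an extra \r per row. An in-memory buffer is used here purely for illustration.

```python
import csv
import io

# The default "excel" dialect ends every row with \r\n.
print(repr(csv.excel.lineterminator))  # '\r\n'

buffer = io.StringIO(newline='')  # newline='' disables newline translation
csv.writer(buffer).writerow(['bob', 25, 200])
print(repr(buffer.getvalue()))         # 'bob,25,200\r\n'
```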

Alternatively: I prefer to use the CSV handler in the pandas module. I find it more tolerant of encoding issues, and pandas will automatically convert string numbers in CSVs into the correct type (int, float, etc.) when loading the file.

import pandas

filepath = 'people.csv'  # hypothetical path; replace with your own file
dataframe = pandas.read_csv(filepath)
list_of_dictionaries = dataframe.to_dict('records')
dataframe.to_csv(filepath)

Note:

  • pandas will take care of opening the file for you if you give it a path, defaults to utf8 in Python 3, and figures out the headers too.
  • A DataFrame is not the same structure as what the csv module gives you, so you add one line after loading to get the same thing: dataframe.to_dict('records')
  • pandas also makes it much easier to control the order of columns in your csv file. Depending on the pandas version, they may come out alphabetical by default, but you can specify the column order. With the vanilla csv module, the order comes from fieldnames; if you take those from a plain dict's keys, use an OrderedDict on older Python versions (plain dicts only guarantee insertion order from Python 3.7 on), or the columns can appear in arbitrary order. See: Preserving column order in Python Pandas DataFrame for more.
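
A small sketch of the column-order point, assuming the same toCSV data as above. The particular order passed to columns= is arbitrary and chosen only to show that it is respected.

```python
import pandas as pd

toCSV = [{'name': 'bob', 'age': 25, 'weight': 200},
         {'name': 'jim', 'age': 31, 'weight': 180}]

dataframe = pd.DataFrame(toCSV)

# With no path given, to_csv returns the CSV text instead of writing a file.
# columns= fixes the column order; index=False omits the row index column.
csv_text = dataframe.to_csv(columns=['weight', 'name', 'age'], index=False)
print(csv_text)

# Round trip back to a list of dictionaries.
records = dataframe.to_dict('records')
print(records == toCSV)
```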

Answer 3

Because @User and @BiXiC asked for help with UTF-8, here is a variation of the solution by @Matthew. (I’m not allowed to comment, so I’m answering.)

import unicodecsv as csv
toCSV = [{'name':'bob','age':25,'weight':200},
         {'name':'jim','age':31,'weight':180}]
keys = toCSV[0].keys()
with open('people.csv', 'wb') as output_file:
    dict_writer = csv.DictWriter(output_file, keys)
    dict_writer.writeheader()
    dict_writer.writerows(toCSV)

Answer 4

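import csv

# `dictionary` is assumed to map each key to a sequence of two values,
# for example {'bob': (25, 200), 'jim': (31, 180)}.
with open('file_name.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(('column1', 'column2', 'column3'))
    for key, value in dictionary.items():
        writer.writerow([key, value[0], value[1]])

This would be the simplest way to write data to a .csv file.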


Answer 5

Here is another, more general solution assuming you don’t have a list of rows (maybe they don’t fit in memory) or a copy of the headers (maybe the write_csv function is generic):

import csv
from collections import OrderedDict

def gen_rows():
    yield OrderedDict(name='bob', age=25, weight=200)
    yield OrderedDict(name='jim', age=31, weight=180)

def write_csv():
    it = gen_rows()
    first_row = next(it)  # it.next() in Python 2
    with open("people.csv", "w", newline="") as outfile:
        # The header is taken from the keys of the first row
        wr = csv.DictWriter(outfile, fieldnames=list(first_row))
        wr.writeheader()
        wr.writerow(first_row)
        wr.writerows(it)  # the remaining rows are consumed lazily

Note: passing keyword arguments to the OrderedDict constructor, as done here, only preserves their order in Python 3.6 and later. If order is important on older versions, use the OrderedDict([('name', 'bob'), ('age', 25)]) form.
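
A slightly more generic variant of the same idea, sketched under a few assumptions: the function name write_rows is hypothetical, it accepts any iterable of dicts plus an already opened file object, plain dicts are used (which keep insertion order on Python 3.7 and later), and an empty input simply produces no output.

```python
import csv
import io


def gen_rows():
    yield {'name': 'bob', 'age': 25, 'weight': 200}
    yield {'name': 'jim', 'age': 31, 'weight': 180}


def write_rows(rows, outfile):
    """Write an iterable of dicts to outfile, taking the header from the first row."""
    it = iter(rows)
    first_row = next(it, None)
    if first_row is None:
        return 0  # nothing to write, not even a header
    writer = csv.DictWriter(outfile, fieldnames=list(first_row))
    writer.writeheader()
    writer.writerow(first_row)
    count = 1
    for row in it:  # rows are consumed one at a time
        writer.writerow(row)
        count += 1
    return count


buffer = io.StringIO(newline='')
print(write_rows(gen_rows(), buffer))  # 2
print(buffer.getvalue())
```

Because the rows are pulled from the iterator one by one, the full data set never has to be held in memory.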


Answer 6

import csv
from datetime import date

toCSV = [{'name':'bob','age':25,'weight':200},
         {'name':'jim','age':31,'weight':180}]
header = ['name', 'age', 'weight']
try:
    # The file name includes today's date in ISO format (YYYY-MM-DD)
    with open('output' + str(date.today()) + '.csv', mode='w', encoding='utf8', newline='') as output_to_csv:
        dict_csv_writer = csv.DictWriter(output_to_csv, fieldnames=header, dialect='excel')
        dict_csv_writer.writeheader()
        dict_csv_writer.writerows(toCSV)
    print('\nData exported to csv successfully')
except IOError as io:
    print('\n', io)