Python integers, strings, and byte strings

I. Converting between integers, strings, and byte strings

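1. Base conversion

Decimal to hexadecimal (note that the result is a hexadecimal string):

hex(16)  ==>  '0x10'

Hexadecimal to decimal:

int(STRING, BASE) parses the string STRING as a number written in base BASE and returns an int. When BASE is given, the first argument must be a string.

int('0x10', 16)  ==>  16

Similarly, oct() returns an octal string and bin() returns a binary string.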

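Hexadecimal string to binary string

hex_str = '00fe'
bin(int('1' + hex_str, 16))[3:]  # keeps leading zeros: the extra '1' digit is sliced off together with '0b'
# result: '0000000011111110'
bin(int(hex_str, 16))[2:]   # drops leading zeros
# result: '11111110'

Binary string to hexadecimal string

bin_str = '0b0111000011001100'
hex(int(bin_str, 2))
# result: '0x70cc'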
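
The slicing tricks above can also be written with format(), which pads to a fixed width and omits the '0b'/'0x' prefix. This is a minimal sketch: the helper names hex_to_bin and bin_to_hex are made up for illustration, and hex_to_bin assumes its input has no '0x' prefix.

```python
def hex_to_bin(hex_str: str) -> str:
    """Convert a hex string (no '0x' prefix) to binary, keeping leading zeros."""
    width = len(hex_str) * 4  # each hex digit corresponds to 4 bits
    return format(int(hex_str, 16), "0{}b".format(width))


def bin_to_hex(bin_str: str) -> str:
    """Convert a binary string (with or without '0b') to hex without a prefix."""
    return format(int(bin_str, 2), "x")


print(hex_to_bin("00fe"))                # 0000000011111110
print(bin_to_hex("0b0111000011001100"))  # 70cc
```

Computing the width from the number of hex digits avoids the extra '1' digit used to protect leading zeros.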

2. String to integer

Decimal string:

int('10')  ==>  10

Hexadecimal string:

int('10', 16)  ==>  16
# or
int('0x10', 16)  ==>  16
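
When the base is not known in advance, passing base 0 lets int() infer it from the prefix. A small sketch, where parse_int is a hypothetical wrapper name:

```python
def parse_int(text: str) -> int:
    """Parse '10', '0x10', '0o10' or '0b10'; base 0 tells int() to read the prefix."""
    return int(text, 0)


print(parse_int("10"))    # 10
print(parse_int("0x10"))  # 16
print(parse_int("0o10"))  # 8
print(parse_int("0b10"))  # 2
```

With base 0, a decimal string with a leading zero such as '010' is rejected with ValueError, so this form suits input that always carries an explicit prefix.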

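3. Byte string to integer

Use the struct module, which is common for network packets and is compatible with C data structures.
The format characters supported by struct are listed in the table below.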

Format	C type	Python type (Python 3)	Standard size (bytes)	Notes
x	pad byte	no value	1
c	char	bytes of length 1	1
b	signed char	int	1
B	unsigned char	int	1
?	_Bool	bool	1
h	short	int	2
H	unsigned short	int	2
i	int	int	4
I	unsigned int	int	4
l	long	int	4
L	unsigned long	int	4
q	long long	int	8
Q	unsigned long long	int	8
f	float	float	4
d	double	float	8
s	char[]	bytes	1 per byte	length is set by the repeat count, e.g. 4s
p	char[]	bytes	1 per byte	Pascal string: a length byte followed by the data
P	void *	int	platform dependent	pointer; available in native mode only

Byte order, size, and alignment: set by the first character of fmt

CHARACTER	BYTE ORDER	SIZE	ALIGNMENT
@	native	native	native
=	native	standard	none
<	little-endian	standard	none
>	big-endian	standard	none
!	network (= big-endian)	standard	none

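Stubs for pack and unpack in the Python struct module (the real implementation lives in the C extension _struct):

def pack(fmt, *args): # known case of _struct.pack
    """ Return a bytes object containing values v1, v2, ... packed according to fmt. """
    return b""

def unpack(fmt, buffer): # known case of _struct.unpack
    """
    Unpack the buffer containing packed C structure data, according to fmt.
    Requires len(buffer) == calcsize(fmt).
    """
    pass

# The first argument is a format string built from the format characters in the tables above.
# The remaining arguments (*args) are the values to pack into the byte string.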

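Unpack as short integers:

struct.unpack('<hh', b'\x01\x00\x00\x00')  ==>  (1, 0)
# In "<hh", '<' means little-endian (the least significant byte is stored first) and each 'h' is a 2-byte signed short, so the four bytes are read as two numbers of two bytes each.

Unpack as a long integer:

struct.unpack('<L', b'\x01\x00\x00\x00')  ==>  (1,)
# The four bytes are read as one 4-byte unsigned long; unpack always returns a tuple.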
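
For comparison, here is a short sketch that reads the same four bytes with struct.unpack and with the built-in int.from_bytes. The variable names are illustrative only.

```python
import struct

data = b"\x01\x00\x00\x00"

# struct.unpack always returns a tuple, even for a single value.
(as_long,) = struct.unpack("<L", data)
low, high = struct.unpack("<hh", data)

# int.from_bytes reads the whole byte string as one integer.
little = int.from_bytes(data, byteorder="little")
big = int.from_bytes(data, byteorder="big")

print(as_long, (low, high))    # 1 (1, 0)
print(little, big)             # 1 16777216
print(struct.calcsize("<hh"))  # 4, the number of bytes the format consumes
```

int.from_bytes is convenient for a single integer of arbitrary length, while struct is the better fit when one buffer holds several fields.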

4. Integer to byte string

Pack as two bytes each:

struct.pack('<HH', 1, 2)  ==>  b'\x01\x00\x02\x00'
# Packs the values 1 and 2 as two bytes each (1 becomes b'\x01\x00', 2 becomes b'\x02\x00').

Pack as four bytes each:

struct.pack('<LL', 1, 2)  ==>  b'\x01\x00\x00\x00\x02\x00\x00\x00'
# Packs the values 1 and 2 as four bytes each (for example, 1 becomes b'\x01\x00\x00\x00').
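
The sketch below shows how byte order changes the packed result, the int.to_bytes alternative, and what happens when a value does not fit the chosen format. The variable names are illustrative.

```python
import struct

little = struct.pack("<H", 1)   # little-endian: low byte first
network = struct.pack("!H", 1)  # network order is big-endian: high byte first
four = (1).to_bytes(4, byteorder="little")

print(little)   # b'\x01\x00'
print(network)  # b'\x00\x01'
print(four)     # b'\x01\x00\x00\x00'

# A value outside the range of the format raises struct.error.
overflowed = False
try:
    struct.pack("<H", 70000)  # 'H' holds 0..65535 only
except struct.error:
    overflowed = True
print(overflowed)  # True
```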

5. Integer to string

Use the built-in function directly:

str(100)  ==>  '100'

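6. String to byte string

Difference between decode and encode

In Python 2, decode('hex') decodes the text held in CT, 69dda8455c7dd425, two characters at a time into the raw bytes \x69\xdd\xa8\x45\x5c\x7d\xd4\x25.

# Python 2 only
CT = '69dda8455c7dd425'
print "%r" % CT.decode('hex')

In Python 2, encode('hex') goes the other way: every character of CT is replaced by its ASCII value written as two hexadecimal digits. The code below prints 36396464613834353563376464343235, because the first character '6' is encoded as 0x36, the second character '9' as 0x39, and so on.

# Python 2 only
CT = '69dda8455c7dd425'
print "%r" % CT.encode('hex')

Put simply: decode('hex') halves the length of the string, and encode('hex') doubles it.

Python 2 and Python 3 handle encoding differently, so take care to tell them apart. The 'hex' codec calls above work only on Python 2 strings.

decode('ascii') decodes a byte string into a Unicode string; in Python 2 the result is displayed with a 'u' prefix.
encode('ascii') encodes a Unicode string into a byte string. In Python 2 the result looks the same as the original str; in Python 3 it is a bytes object displayed with a 'b' prefix.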
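
In Python 3, str has no decode method and the 'hex' codec cannot be applied to str, so the two operations need different calls. A minimal sketch of the Python 3 equivalents, reusing the same CT value:

```python
import binascii

CT = "69dda8455c7dd425"

# Python 2: CT.decode('hex')  ->  Python 3: bytes.fromhex
raw = bytes.fromhex(CT)
print(len(CT), len(raw))  # 16 8, half the length

# Python 2: CT.encode('hex')  ->  Python 3: encode to bytes, then hex()
doubled = CT.encode("ascii").hex()
print(doubled)            # 36396464613834353563376464343235

# Going back from raw bytes to the hex text
print(raw.hex() == CT)                                  # True
print(binascii.hexlify(raw).decode("ascii") == CT)      # True
```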

Encode a string as bytes:

'12abc'.encode('ascii')  ==>  b'12abc'

From a list of numbers or character codes:

bytes([1, 2, ord('1'), ord('2')])  ==>  b'\x01\x0212'

From a hexadecimal string:

bytes.fromhex('010210')  ==>  b'\x01\x02\x10'

From a string written with hexadecimal escapes:

bytes(map(ord, '\x01\x02\x31\x32'))  ==>  b'\x01\x0212'

From a list of hexadecimal numbers:

bytes([0x01, 0x02, 0x31, 0x32])  ==>  b'\x01\x0212'

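7. Byte string to string

Decode bytes into a string:

b'\x31\x32\x61\x62'.decode('ascii')  ==>  '12ab'

Byte string to its escaped representation (hex escapes mixed with printable ASCII):

str(b'\x01\x0212')[2:-1]  ==>  \x01\x0212 (when printed)

Byte string to hex text, always two digits per byte (requires import binascii):

binascii.b2a_hex(b'\x01\x0212').decode('ascii')  ==>  '01023132'

Byte string to a list of hex strings:

[hex(x) for x in b'\x01\x0212']  ==>  ['0x1', '0x2', '0x31', '0x32']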
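
The same conversions can be collected in one place using only built-ins. This is a sketch with illustrative variable names; bytes.hex() is assumed to be available (Python 3.5 or later).

```python
data = b"\x01\x0212"

text = data[2:].decode("ascii")              # the printable tail as a str
hex_digits = data.hex()                      # two hex digits per byte
hex_list = [format(b, "02x") for b in data]  # fixed-width items, unlike hex()
escaped = repr(data)[2:-1]                   # strip the leading b' and trailing '

print(text)        # 12
print(hex_digits)  # 01023132
print(hex_list)    # ['01', '02', '31', '32']
print(escaped)     # \x01\x0212
```

format(b, "02x") keeps every item two characters wide, which hex() does not do for values below 0x10.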

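Question: when should a string literal be prefixed with 'r', 'b', or 'u'? The official documentation covers this. In Python 2 the 'b' prefix has no effect, so b'...' is equivalent to a plain '...'. The 'r' prefix is different: it marks a raw string in which backslashes are not treated as escape characters.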

The Python 2.x documentation:

A prefix of ‘b’ or ‘B’ is ignored in Python 2; it indicates that the literal should become a bytes literal in Python 3 (e.g. when code is automatically converted with 2to3). A ‘u’ or ‘b’ prefix may be followed by an ‘r’ prefix.
In other words, a 'b' prefix on a string literal is ignored by Python 2. It exists only for compatibility with Python 3, where it makes the literal a bytes object (values 0 to 255).

The Python 3.3 documentation states:

Bytes literals are always prefixed with ‘b’ or ‘B’; they produce an instance of the bytes type instead of the str type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.
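In other words, bytes literals always carry a 'b' prefix and may contain only ASCII characters; any other byte value must be written as an escape.

Below is an answer from Stack Overflow that I found helpful and want to share: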

In Python 2.x

Pre-3.0 versions of Python lacked this kind of distinction between text and binary data. Instead, there was:

unicode = u’…’ literals = sequence of Unicode characters = 3.x str

str = ‘…’ literals = sequences of confounded bytes/characters
Usually text, encoded in some unspecified encoding.
But also used to represent binary data like struct.pack output.

Python 3.x makes a clear distinction between the types:

str = ‘…’ literals = a sequence of Unicode characters (UTF-16 or UTF-32, depending on how Python was compiled)

bytes = b’…’ literals = a sequence of octets (integers between 0 and 255)
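
The Python 3 distinction described above can be observed directly. A small sketch, using an arbitrary sample word; the variable names are illustrative.

```python
text = "caf\u00e9"          # str: 4 Unicode characters
raw = text.encode("utf-8")  # bytes: 5 octets, the accented letter takes 2 bytes in UTF-8

print(type(text).__name__, len(text))  # str 4
print(type(raw).__name__, len(raw))    # bytes 5
print(raw[0])                          # 99, indexing bytes yields an int

# The r prefix only disables backslash escapes; it can be combined with b.
print(len("\n"), len(r"\n"), len(rb"\n"))  # 1 2 2

# Mixing the two types raises TypeError in Python 3.
mixing_failed = False
try:
    text + raw
except TypeError:
    mixing_failed = True
print(mixing_failed)  # True
```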

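II. Python byte strings in detail

1. Understanding byte strings

A byte string stores data in units of bytes
A byte string is an immutable sequence of bytes
Each byte is an integer between 0 and 255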

2. Creating byte strings

# Literals for an empty byte string
b'' 
b""
b''''''
b""""""
B''
B""
B''''''
B""""""
# Literals for a non-empty byte string
b'ABCD'
b'\x41\x42'
b'hello Jason'

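3. Byte string constructors

bytes() creates an empty byte string, the same as b''
bytes(iterable of integers) builds a byte string from the iterable; every value must be in the range 0 to 255
bytes(integer n) creates a byte string of n zero bytes
bytes(string, encoding='utf-8') builds a byte string by encoding the string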

a = bytes()  # b''
b = bytes([10,20,30,65,66,67])  # b'\n\x14\x1eABC'
c = bytes(range(65,65+26))  # b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
d = bytes(5)  # b'\x00\x00\x00\x00\x00'
e = bytes('hello 中国','utf-8')  # b'hello \xe4\xb8\xad\xe5\x9b\xbd'
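
The constructor rules have a few edge cases worth checking by hand. A sketch, where try_bytes is a hypothetical helper that reports the exception type instead of raising:

```python
def try_bytes(*args):
    """Return bytes(*args), or the name of the exception it raises."""
    try:
        return bytes(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(try_bytes([10, 20, 256]))    # ValueError: 256 is out of the 0..255 range
print(try_bytes("hello"))          # TypeError: a str needs an explicit encoding
print(try_bytes("hello", "ascii")) # b'hello'
print(try_bytes(3))                # b'\x00\x00\x00', three zero bytes
print(try_bytes([3]))              # b'\x03', one byte with value 3
```

Note the difference between bytes(3) and bytes([3]): an integer argument is a length, while a list supplies the byte values.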

4. Operations on byte strings

+ += * *=
< <= > >= == !=
in / not in (the left operand is an integer from 0 to 255 or a bytes subsequence)
indexing / slicing

b = b'abc' + b'123'  # b=b'abc123'
b += b'ABC'    # b=b'abc123ABC'
b'ABD' > b'ABC'  # True
b'a'*4   # b'aaaa'
b = b'ABCD'
65 in b    # True
b'A' in b  # True
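
Indexing and slicing behave differently on bytes: an index gives an int, a slice gives bytes. A short sketch with illustrative names:

```python
b = b"ABCD"

first = b[0]        # an int
head = b[0:1]       # a bytes object of length 1
backwards = b[::-1]

print(first)        # 65
print(head)         # b'A'
print(backwards)    # b'DCBA'
print(list(b))      # [65, 66, 67, 68]
print(b"BC" in b)   # True, subsequence test
print(66 in b)      # True, single byte value test

# bytes is immutable: item assignment raises TypeError.
assignment_failed = False
try:
    b[0] = 97
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True
```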

5. Differences between `bytes` and `str`

bytes stores bytes (0 to 255)
str stores Unicode characters (code points 0 to 0x10FFFF)

6. Converting between `bytes` and `str`

str to bytes
b = s.encode('utf-8')
bytes to str
s = b.decode('utf-8')
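
A round trip only works when both directions use the same encoding. The sketch below shows the failure mode and the errors parameter; the sample string and variable names are illustrative.

```python
s = "hello 中国"
b = s.encode("utf-8")
round_trip = b.decode("utf-8")
print(round_trip == s)  # True

# Decoding with a codec that cannot represent the bytes raises UnicodeDecodeError.
decode_failed = False
try:
    b.decode("ascii")
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)  # True

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
lenient = b.decode("ascii", errors="replace")
print(lenient.startswith("hello "))  # True

# errors='ignore' drops characters that the target encoding cannot represent.
ascii_only = s.encode("ascii", errors="ignore")
print(ascii_only)  # b'hello '
```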

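III. Byte arrays

A bytearray is a mutable sequence of bytes.

1. Constructors

bytearray() creates an empty byte array, the same as bytearray(b'')
bytearray(iterable of integers) builds a byte array from the iterable; every value must be in the range 0 to 255
bytearray(integer n) creates a byte array of n zero bytes
bytearray(string, encoding='utf-8') builds a byte array by encoding the string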

a = bytearray() # bytearray(b'')
b = bytearray(5) # bytearray(b'\x00\x00\x00\x00\x00')
c = bytearray([1,2,3,4]) # bytearray(b'\x01\x02\x03\x04')

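2. Operations on byte arrays

+ += * *=
< <= > >= == !=
in / not in (the left operand is an integer from 0 to 255 or a bytes subsequence)
indexing / slicing

3. Byte array methods

B.clear() empties the byte array
B.append(n) appends one byte (n is an integer from 0 to 255)
B.remove(value) removes the first occurrence of the byte; raises ValueError if it is not present
B.reverse() reverses the order of the bytes in place
B.decode(encoding='utf-8') decodes the byte array into a str
B.find(sub[, start[, end]]) returns the lowest index where sub is found, or -1 if it is not found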
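
The methods listed above can be tried in sequence on one buffer. A minimal sketch with illustrative variable names:

```python
buf = bytearray(b"abc")
buf.append(100)   # bytearray(b'abcd')
buf[0] = 65       # in-place assignment works: bytearray(b'Abcd')
buf.remove(98)    # removes the first b'b': bytearray(b'Acd')
buf.reverse()     # bytearray(b'dcA')

found = buf.find(b"c")
missing = buf.find(b"z")
text = buf.decode("utf-8")
print(found, missing, text)  # 1 -1 dcA

# Removing a value that is not present raises ValueError.
remove_failed = False
try:
    buf.remove(0)
except ValueError:
    remove_failed = True

frozen = bytes(buf)  # immutable copy taken before clearing
buf.clear()
print(buf, frozen)   # bytearray(b'') b'dcA'
```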

Acknowledgments: (https://www.kancloud.cn/hx78/python/450119)
(https://www.jianshu.com/p/5bb986772ef8)
