Question: Convert int to ASCII and back in Python

I’m working on making a URL shortener for my site, and my current plan (I’m open to suggestions) is to use a node ID to generate the shortened URL. So, in theory, node 26 might be short.com/z, node 1 might be short.com/a, node 52 might be short.com/Z, and node 104 might be short.com/ZZ. When a user goes to that URL, I need to reverse the process (obviously).

I can think of some kludgy ways to go about this, but I’m guessing there are better ones. Any suggestions?


Answer 0

ASCII to int:

ord('a')

gives 97

And back to a string:

  • in Python2: str(unichr(97))
  • in Python3: chr(97)

gives 'a'


Answer 1

>>> ord("a")
97
>>> chr(97)
'a'

Answer 2

If multiple characters are packed into a single integer/long, as in my case:

s = '0123456789'
nchars = len(s)
# string to int or long. Type depends on nchars
x = sum(ord(s[byte])<<8*(nchars-byte-1) for byte in range(nchars))
# int or long to string
''.join(chr((x>>8*(nchars-byte-1))&0xFF) for byte in range(nchars))

Yields '0123456789' and x = 227581098929683594426425L (the trailing L is Python 2's long suffix; Python 3 prints the number without it).
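
On Python 3 the same packing can probably be done with the built-in int.from_bytes and int.to_bytes methods instead of manual shifting. This is a minimal sketch, not part of the original answer; the helper names pack_ascii and unpack_ascii are made up for illustration, and it assumes the string is pure ASCII and that the caller remembers the original length.

```python
def pack_ascii(s):
    """Pack an ASCII string into one integer, first character most significant."""
    return int.from_bytes(s.encode('ascii'), byteorder='big')


def unpack_ascii(x, nchars):
    """Recover the nchars-character ASCII string packed into the integer x."""
    return x.to_bytes(nchars, byteorder='big').decode('ascii')


s = '0123456789'
x = pack_ascii(s)
restored = unpack_ascii(x, len(s))
```

The length has to be passed back in because leading NUL characters would otherwise be lost when the integer is converted back.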


Answer 3

What about BASE58 encoding the node ID? Flickr, for example, does this.

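# The alphabet omits the easily confused characters 0, l, I and O.
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ'

def shorten(node_id):  # function name is illustrative
    """Return the short URL for a non-negative integer node_id."""
    url = ''
    while node_id >= 58:
        node_id, mod = divmod(node_id, 58)
        url = BASE58[mod] + url
    # Whatever is left is the most significant digit.
    return 'http://short.com/' + BASE58[node_id] + url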

Turning that back into a number isn’t a big deal either.
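
For completeness, here is one way the reverse direction might look. This is a sketch rather than the answerer's own code: the function names base58_encode and base58_decode are hypothetical, the encoder returns only the path part (no domain), and IDs are assumed to be non-negative integers.

```python
# Same alphabet as in the answer: no 0, l, I or O.
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ'


def base58_encode(node_id):
    """Encode a non-negative integer as a base58 string."""
    code = ''
    while node_id >= 58:
        node_id, mod = divmod(node_id, 58)
        code = BASE58[mod] + code
    return BASE58[node_id] + code


def base58_decode(code):
    """Decode a base58 string back to the integer it represents.

    Raises ValueError if the string contains a character outside the alphabet.
    """
    node_id = 0
    for ch in code:
        node_id = node_id * 58 + BASE58.index(ch)
    return node_id
```

Each character contributes one base-58 digit, so decoding is just the usual positional accumulation from left to right.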


Answer 4

Use hex(id)[2:] and int(urlpart, 16). There are other options. base32 encoding your id could work as well, but I don’t know that there’s any library that does base32 encoding built into Python.
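
As a rough sketch of the hex round trip (the function names encode_hex and decode_hex are invented here): format(n, 'x') gives the same digits as hex(n)[2:] for non-negative integers, and it sidesteps the trailing L that Python 2 appends to hex() of a long.

```python
def encode_hex(node_id):
    """Return the lowercase hex digits of a non-negative integer, without '0x'."""
    return format(node_id, 'x')


def decode_hex(urlpart):
    """Parse hex digits (either case) back into an integer."""
    return int(urlpart, 16)
```

Because int(urlpart, 16) accepts both upper and lower case, users who retype a URL in capitals would still land on the same node.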

Apparently a base32 encoder was introduced in Python 2.4 with the base64 module. You might try using b32encode and b32decode. You should pass casefold=True to b32decode and set its map01 option to the letter the digit 1 should be read as ('I' or 'L'; 0 is then read as 'O'), in case people write down your shortened URLs.

Actually, I take that back. I still think base32 encoding is a good idea, but that module is not useful for the case of URL shortening. You could look at the implementation in the module and make your own for this specific case. :-)
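
A hand-rolled variant for integer IDs might look something like the sketch below. It is not taken from the base64 module; the alphabet (digits plus lowercase letters without i, l, o and u, loosely in the style of Crockford's base32), the alias table, and the function names are all assumptions made for illustration.

```python
# Hypothetical 32-character alphabet: digits, then lowercase letters minus i, l, o, u.
ALPHABET = '0123456789abcdefghjkmnpqrstvwxyz'

# Forgiving input: treat o as 0, and i or l as 1 (assumed policy).
ALIASES = str.maketrans('oil', '011')


def b32_encode_id(node_id):
    """Encode a non-negative integer using the 32-character alphabet."""
    if node_id < 0:
        raise ValueError('node_id must be non-negative')
    code = ''
    while True:
        node_id, rem = divmod(node_id, 32)
        code = ALPHABET[rem] + code
        if node_id == 0:
            return code


def b32_decode_id(code):
    """Decode a string produced by b32_encode_id, ignoring case and look-alikes."""
    normalized = code.lower().translate(ALIASES)
    node_id = 0
    for ch in normalized:
        node_id = node_id * 32 + ALPHABET.index(ch)  # ValueError on bad input
    return node_id
```

Unlike b32encode, this works directly on the integer and produces no padding characters, which is roughly what a short URL needs.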

