当前位置: 首页 > 文档资料 > Python 文本处理 >

Text wrapping

优质
小牛编辑
131浏览
2023-12-01

从某些来源抓取的文本格式不正确无法在可用的屏幕宽度内显示时,需要进行文本换行。 这可以通过使用以下包来实现,该包可以使用以下命令安装在我们的环境中。

pip install parawrap

以下段落有一个连续的文本字符串。 在应用wrap函数时,我们可以看到文本如何分成多个用逗号分隔的行。

import parawrap
text = "In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi (Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia family, is known to friends and associates as Godfather. He and Tom Hagen (Robert Duvall), the Corleone family lawyer, are hearing requests for favors because, according to Italian tradition, no Sicilian can refuse a request on his daughter's wedding day. One of the men who asks the Don for a favor is Amerigo Bonasera, a successful mortician and acquaintance of the Don, whose daughter was brutally beaten by two young men because she refused their advances; the men received minimal punishment from the presiding judge. The Don is disappointed in Bonasera, who'd avoided most contact with the Don due to Corleone's nefarious business dealings. The Don's wife is godmother to Bonasera's shamed daughter, a relationship the Don uses to extract new loyalty from the undertaker. The Don agrees to have his men punish the young men responsible (in a non-lethal manner) in return for future service if necessary."
print parawrap.wrap(text)

当我们运行上面的程序时,我们得到以下输出 -

['In late summer 1945, guests are gathered for the wedding reception of', "Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi", '(Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia', 'family, is known to friends and associates as Godfather. He and Tom', 'Hagen (Robert Duvall), the Corleone family lawyer, are hearing', 'requests for favors because, according to Italian tradition, no', "Sicilian can refuse a request on his daughter's wedding day. One of", 'the men who asks the Don for a favor is Amerigo Bonasera, a successful', 'mortician and acquaintance of the Don, whose daughter was brutally', 'beaten by two young men because she refused their advances; the men', 'received minimal punishment from the presiding judge. The Don is', "disappointed in Bonasera, who'd avoided most contact with the Don due", "to Corleone's nefarious business dealings. The Don's wife is godmother", "to Bonasera's shamed daughter, a relationship the Don uses to extract", 'new loyalty from the undertaker. The Don agrees to have his men punish', 'the young men responsible (in a non-lethal manner) in return for', 'future service if necessary.']

我们还可以应用具有特定宽度的wrap函数作为输入参数,如果需要保持所需的wrap函数宽度,它将剪切单词。

import parawrap
text = "In late summer 1945, guests are gathered for the wedding reception of Don Vito Corleone's daughter Connie (Talia Shire) and Carlo Rizzi (Gianni Russo). Vito (Marlon Brando), the head of the Corleone Mafia family, is known to friends and associates as Godfather. He and Tom Hagen (Robert Duvall), the Corleone family lawyer, are hearing requests for favors because, according to Italian tradition, no Sicilian can refuse a request on his daughter's wedding day. One of the men who asks the Don for a favor is Amerigo Bonasera, a successful mortician and acquaintance of the Don, whose daughter was brutally beaten by two young men because she refused their advances; the men received minimal punishment from the presiding judge. The Don is disappointed in Bonasera, who'd avoided most contact with the Don due to Corleone's nefarious business dealings. The Don's wife is godmother to Bonasera's shamed daughter, a relationship the Don uses to extract new loyalty from the undertaker. The Don agrees to have his men punish the young men responsible (in a non-lethal manner) in return for future service if necessary."
print parawrap.wrap(text,5)

当我们运行上面的程序时,我们得到以下输出 -

['In', 'late ', 'summe', 'r', '1945,', 'guest', 's are', 'gathe', 'red', 'for', 'the w', 'eddin', 'g rec', 'eptio', 'n of', 'Don', 'Vito ', 'Corle', "one's", 'daugh', 'ter C', 'onnie', '(Tali', 'a Shi', 're)', 'and', 'Carlo', 'Rizzi', '(Gian', 'ni Ru', 'sso).', 'Vito ', '(Marl', 'on Br', 'ando)', ', the', 'head', 'of', 'the C', 'orleo', 'ne', 'Mafia', 'famil', 'y, is', 'known', 'to fr', 'iends', 'and a', 'ssoci', 'ates', 'as Go', 'dfath', 'er.', 'He', 'and', 'Tom', 'Hagen', '(Robe', 'rt Du', 'vall)', ', the', 'Corle', 'one f', 'amily', 'lawye', 'r,', 'are h', 'earin', 'g req', 'uests', 'for f', 'avors', 'becau', 'se, a', 'ccord', 'ing', 'to It', 'alian', 'tradi', 'tion,', 'no Si', 'cilia', 'n can', 'refus', 'e a r', 'eques', 't on', 'his d', 'aught', "er's ", 'weddi', 'ng', 'day.', 'One', 'of', 'the', 'men', 'who', 'asks', 'the', 'Don', 'for a', 'favor', 'is Am', 'erigo', 'Bonas', 'era,', 'a suc', 'cessf', 'ul mo', 'rtici', 'an', 'and a', 'cquai', 'ntanc', 'e of', 'the', 'Don,', 'whose', 'daugh', 'ter', 'was b', 'rutal', 'ly be', 'aten', 'by', 'two', 'young', 'men b', 'ecaus', 'e she', 'refus', 'ed', 'their', 'advan', 'ces;', 'the', 'men r', 'eceiv', 'ed mi', 'nimal', 'punis', 'hment', 'from', 'the p', 'resid', 'ing j', 'udge.', 'The', 'Don', 'is di', 'sappo', 'inted', 'in Bo', 'naser', 'a,', "who'd", 'avoid', 'ed', 'most ', 'conta', 'ct', 'with', 'the', 'Don', 'due', 'to Co', 'rleon', "e's n", 'efari', 'ous b', 'usine', 'ss de', 'aling', 's.', 'The', "Don's", 'wife', 'is go', 'dmoth', 'er to', 'Bonas', "era's", 'shame', 'd dau', 'ghter', ', a r', 'elati', 'onshi', 'p the', 'Don', 'uses', 'to ex', 'tract', 'new l', 'oyalt', 'y', 'from', 'the u', 'ndert', 'aker.', 'The', 'Don a', 'grees', 'to', 'have', 'his', 'men p', 'unish', 'the', 'young', 'men r', 'espon', 'sible', '(in a', 'non-l', 'ethal', 'manne', 'r) in', 'retur', 'n for', 'futur', 'e ser', 'vice', 'if ne', 'cessa', 'ry.']