Problem
1133. SPAM
Constraints
Time Limit: 10 secs, Memory Limit: 32 MB
Description
You never had any friends, and don’t really want any anyways, and so you have decided to collect email addresses from web pages for direct e-mail advertising.
The text delivered to a web browser is usually marked up HTML, which may contain email addresses of the form:
user@server
- Both user and server are of the form alpha.numeric.with.dots. By alpha.numeric.with.dots, we mean a sequence of one or more characters which are alphabetic (A-Z,a-z), numeric (0-9), hyphens (-), underbars (_) and/or periods (.), with the following restrictions on periods:
  - The sequence neither starts nor ends with a period.
  - No periods are adjacent.
- Email addresses are preceded by the beginning of the file, or some character other than a letter (A-Z,a-z), digit (0-9), hyphen (-), or underbar (_).
- Email addresses are succeeded by the end of the file, or some character other than a letter (A-Z,a-z), digit (0-9), hyphen (-), or underbar (_).
- If the scanned text contains a sequence of the form
  first@second@third
  then the output should contain first@second and second@third as email addresses. In a longer run, each pair split by an @-sign should appear as an email address in the output.
The point of this problem is to extract and record the email addresses embedded in other text.
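The restrictions on periods can be checked directly on a candidate user or server string. The following is a minimal sketch of that check; the helper name `is_dotted_token` is hypothetical and does not appear in the original solution.

```cpp
#include <string>

// Hypothetical helper: true if s has the form "alpha.numeric.with.dots"
// as described in the problem statement.
bool is_dotted_token(const std::string& s) {
    // One or more characters, neither starting nor ending with a period.
    if (s.empty() || s.front() == '.' || s.back() == '.') return false;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char c = s[i];
        bool alnum = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                     (c >= 'A' && c <= 'Z');
        if (!(alnum || c == '-' || c == '_' || c == '.')) return false;
        // No two periods may be adjacent.
        if (c == '.' && i + 1 < s.size() && s[i + 1] == '.') return false;
    }
    return true;
}
```

Under these rules "banks.com" passes, while ".a", "a." and "a..b" are rejected.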
Input
The input file will contain zero or more lines of ASCII text.
Output
Other than the standard leader and trailer, the output file has each email address found in the input file in the order it was found (duplicates not removed).
Sample Input
bob@banks.com wrote:
What does x=7 mean for this problem? For
example,
..a@a@aa@aaa@aaa..a@a@aa@aaa@aaa..a@a..@a…a@..@..
this scrolling @-example from jim@jones.com
Sample Output
bob@banks.com
a@a
a@aa
aa@aaa
aaa@aaa
a@a
a@aa
aa@aaa
aaa@aaa
a@a
jim@jones.com
Approach
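The goal is to collect the valid characters on both sides of each @ and join them into an email address. Calling string's find directly is possible, but in practice it is awkward.
It is enough to scan the line character by character and, once an @ is found, search outward in both directions for the valid part.
Finding the valid part can be simplified with a helper function: first filter out everything that is not a valid character, then deal with the special cases.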
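The outward search can be sketched as a small function that returns the maximal run of valid characters around one @. This is only an illustration under the assumption that periods are handled in a later step; the name `valid_run` is hypothetical, and `valid_char` mirrors the helper used in the solution.

```cpp
#include <string>
#include <utility>

// True for characters that may appear in the user or server part.
bool valid_char(char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') || c == '-' || c == '_' || c == '.';
}

// Hypothetical helper: given the index of an '@' in line, return the
// inclusive bounds {left, right} of the run of valid characters around it.
// Periods are not examined here; '@' itself is not a valid character,
// so the run stops at a neighbouring '@'.
std::pair<int, int> valid_run(const std::string& line, int at_pos) {
    int left = at_pos, right = at_pos;
    while (left > 0 && valid_char(line[left - 1])) --left;
    while (right + 1 < static_cast<int>(line.size()) &&
           valid_char(line[right + 1])) ++right;
    return {left, right};
}
```

Because the run stops at the next @, a chain such as first@second@third naturally yields second as the server of one address and the user of the next.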
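For the valid part that was found:
The ".." case shows up in many forms, especially at either end, so it needs great care. First handle the case where ".." sits exactly at the start or end of the valid part.
The ".." check should only run while the current character is at or after the second position and at or before the second-to-last one, so that nothing outside the array is accessed.
After that, strip a single "." from either end.
Finally, a little simple filtering is enough before printing the address.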
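The period handling can also be expressed on strings rather than indices. The sketch below follows the rules in the problem statement and is not the author's code: `user_part` and `server_part` are hypothetical names, and each takes the run of valid characters found on one side of the @. It assumes that a period directly touching the @ leaves no valid address on that side.

```cpp
#include <string>

// Hypothetical helper: run is the maximal run of valid characters that
// ends right before an '@'. Returns the user part, or "" if there is none.
std::string user_part(const std::string& run) {
    std::string s = run;
    // Periods may not be adjacent: keep only what follows the last "..".
    std::size_t dd = s.rfind("..");
    if (dd != std::string::npos) s = s.substr(dd + 2);
    // Drop a single leading period.
    if (!s.empty() && s.front() == '.') s.erase(0, 1);
    // A period touching the '@' means no valid user (assumption from the rules).
    if (!s.empty() && s.back() == '.') return "";
    return s;
}

// Hypothetical helper: run is the maximal run of valid characters that
// starts right after an '@'. Returns the server part, or "" if there is none.
std::string server_part(const std::string& run) {
    std::string s = run;
    // Keep only what precedes the first "..".
    std::size_t dd = s.find("..");
    if (dd != std::string::npos) s = s.substr(0, dd);
    // Drop a single trailing period.
    if (!s.empty() && s.back() == '.') s.pop_back();
    // A period touching the '@' means no valid server (assumption from the rules).
    if (!s.empty() && s.front() == '.') return "";
    return s;
}
```

An address is printed only when both parts are non-empty, which corresponds to the final filtering step.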
Code
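#include <iostream>
#include <string>
using namespace std;

// True for characters that may appear in the user or server part.
bool valid_char(char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') || c == '-' || c == '_' || c == '.';
}

int main() {
    string spam;
    while (getline(cin, spam)) {
        const int n = static_cast<int>(spam.size());
        for (int i = 0; i < n; i++) {
            if (spam[i] != '@') continue;
            // A period directly beside the '@' would end the user or start
            // the server with a period, which the rules forbid.
            if ((i > 0 && spam[i - 1] == '.') ||
                (i + 1 < n && spam[i + 1] == '.')) continue;
            int pre_pos = i, aft_pos = i;
            // Scan left over valid characters, stopping just after a "..".
            for (int j = i - 1; j >= 0; j--) {
                if (!valid_char(spam[j])) break;
                if (j >= 1 && spam[j] == '.' && spam[j - 1] == '.') {
                    pre_pos = j + 1;
                    break;
                }
                pre_pos = j;
            }
            // Scan right over valid characters, stopping just before a "..".
            for (int j = i + 1; j < n; j++) {
                if (!valid_char(spam[j])) break;
                if (j + 1 < n && spam[j] == '.' && spam[j + 1] == '.') {
                    aft_pos = j - 1;
                    break;
                }
                aft_pos = j;
            }
            // Strip a single period from either end.
            if (spam[pre_pos] == '.') pre_pos++;
            if (spam[aft_pos] == '.') aft_pos--;
            // Both the user and the server must be non-empty.
            if (pre_pos >= i || aft_pos <= i) continue;
            cout << spam.substr(pre_pos, aft_pos - pre_pos + 1) << endl;
        }
    }
    return 0;
}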