graphql
by Michael Hunger
A while back, I came across Peggy’s Twitter request:
Which got a really cool response from Bonnie that warmed my heart:
And triggered an idea …
As you might know, we’re having a lot of fun showing the impressive engagement of developers in their communities (for example GraphQL, Neo4j, …) in a single place by importing them into a “Community Graph.” Usually, it is really hard to follow the flurry of activity on Twitter, Slack, StackOverflow, GitHub, and so on to keep on top of what’s happening, especially if your community is growing quickly.
So we scratched an itch and are continuously importing (via AWS Lambda) the activity for the Neo4j community into a single graph, which can then be queried and visualized — and which is accessible here.
We did the same for the GraphQL community, as their data is also accessible via GraphiQL and documented here.
So as we had all the activities of GraphQL from the last few months in our graph database, I thought it would be cool to use it to answer Peggy’s request.
You can access the read-only data here: http://107.170.69.23:7474/browser/ using “graphql” as username and password.
Let’s see if we can find one of the women (Bonnie Brennan) active in Peggy’s Twitter thread, who’s tweeting about GraphQL, and show her tweets and their tags.
We’re using Neo4j’s query language Cypher here, matching the ASCII-art pattern of “a user posting tweets tagged with these tags” and then binding the user’s screen_name to ‘bonnster75’ and returning everything we found.
MATCH (user:Twitter:User)-[:POSTED]->(t:Tweet)-[:TAGGED]->(tag:Tag)
WHERE user.screen_name = 'bonnster75'
RETURN *
One simple way to predict gender is to look at the first name. I know that it is far from reliable, but we’re looking only for suggestions that we’ll check manually later. Then the power of the network can reveal further candidates.
I googled for “gender api” and found this site, which looked really nice and came with 500 free monthly requests and a simple HTTP API. Perfect for my late-night (3am) goal.
I tested a few of the names that came back in response to Peggy’s request: Peggy, Bonnie, Belén, Robin, Danielle, and Morgan. Unfortunately, only a few got recommended to her, so I hoped that I could do better.
I used the interactive first-name check on the gender-API homepage, which produced the results below. I had to change the country to “US” as my default (“DE”) didn’t have the correct mapping for Robin and Morgan.
Peggy
{"name":"peggy","country":"US","gender":"female","samples":3015,"accuracy":99,"duration":"51ms"}
Bonnie
{"name":"bonnie","country":"US","gender":"female","samples":3984,"accuracy":98,"duration":"25ms"}
Morgan
{"name":"morgan","country":"US","gender":"female","samples":5956,"accuracy":76,"duration":"33ms"}
Belén
{"name":"belén","country":"US","gender":"female","samples":35,"accuracy":97,"duration":"64ms"}
Danielle
{"name":"danielle","country":"US","gender":"female","samples":12284,"accuracy":99,"duration":"47ms"}
Robin
{"name":"robin","country":"US","gender":"female","samples":8088,"accuracy":83,"duration":"31ms"}
Based on this data, I think it makes sense to only look at results with an accuracy of more than 75 and at least 10 samples.
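As a rough illustration of that cutoff, here is a minimal Python sketch that keeps only results with an accuracy above 75 and at least 10 samples. The function name `is_usable` and the third record are made up for this example; the other two records are copied from the responses shown earlier.

```python
# Thresholds taken from the text: accuracy above 75 and at least 10 samples.
MIN_ACCURACY = 75
MIN_SAMPLES = 10


def is_usable(result, min_accuracy=MIN_ACCURACY, min_samples=MIN_SAMPLES):
    """Return True if a gender-api style result clears both thresholds."""
    # int() also accepts values that arrive as strings, e.g. "98".
    accuracy = int(result["accuracy"])
    samples = int(result["samples"])
    return accuracy > min_accuracy and samples >= min_samples


# Two results copied from the responses shown earlier.
morgan = {"name": "morgan", "gender": "female", "samples": 5956, "accuracy": 76}
belen = {"name": "belén", "gender": "female", "samples": 35, "accuracy": 97}
# Hypothetical record (not from the API) to show a rejected case.
rare = {"name": "example", "gender": "female", "samples": 4, "accuracy": 100}

print(is_usable(morgan), is_usable(belen), is_usable(rare))
```

Morgan only just clears the accuracy bar, which is one more reason to treat these predictions as suggestions to check by hand.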
You can use the HTTP API like this: https://gender-api.com/get?key=<key>&country=US&name=peggy
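If you build that URL in code, it is worth percent-encoding the name, since first names like Belén contain non-ASCII characters. The following is only a sketch using Python’s standard library; the helper name `gender_api_url` and the placeholder key `YOUR_KEY` are assumptions for illustration, and only the `key`, `country`, and `name` parameters shown in the URL above are used.

```python
from urllib.parse import urlencode

BASE_URL = "https://gender-api.com/get"


def gender_api_url(first_name, key, country="US"):
    """Build a request URL with the three query parameters shown in the post."""
    # urlencode percent-encodes each value as UTF-8, so "belén" becomes "bel%C3%A9n".
    query = urlencode({"key": key, "country": country, "name": first_name})
    return f"{BASE_URL}?{query}"


# YOUR_KEY is a placeholder, not a real API key.
print(gender_api_url("peggy", "YOUR_KEY"))
print(gender_api_url("belén", "YOUR_KEY"))
```

The sketch only constructs the URL; it does not send a request, so it runs without a key or network access.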
Let’s try the same for our community-graph:
We match Twitter users by a list of screen names, split each name on spaces, and take the first word as the first name. We then send that to the “gender-api” API (by calling a user-defined procedure) and get the result back as a map value. We only want to return a few attributes from our user node.
MATCH (user:Twitter:User) WHERE user.screen_name IN ['bonnster75','peggyrayzis','okbel','morgancodes', 'robin_heinze','danimman']
WITH user, head(split(user.name," ")) as firstname
CALL apoc.load.json("https://gender-api.com/get?key=<key>&country=US&name="+firstname) YIELD value
RETURN user { .screen_name, .name, .followers, .statuses} as user_data, firstname, value;
This worked well. Although Morgan got recommended to Peggy, she hadn’t tweeted yet and would probably not be in our “top active” list.
user: {"name":"Bonnie Brennan","screen_name":"bonnster75", "followers":"467","statuses":"2831"}
value: {"name":"bonnie","accuracy":"98","samples":"3984", "country":"US","gender":"female"}
user: {"name":"Belén Curcio","screen_name":"okbel", "followers":"3821","statuses":"35721"}
value: {"name":"belén","accuracy":"97","samples":"35", "country":"US","gender":"female"}
user: {"name":"Morgan Laco","screen_name":"morgancodes", "followers":null,"statuses":null}
value: {"name":"morgan","accuracy":"76","samples":"5956", "country":"US","gender":"female"}
Now we want to find the “most active” women who tweet about GraphQL. A “score” could contain the number of tweets, and how often those tweets have been favorited, retweeted, or replied to. That is what we do here: we find users who have posted tweets, compute the tweet count and an engagement score per user, and return the top 500 ranked by the product of the two.
MATCH (u:Twitter:User)-[:POSTED]->(t:Tweet)
WITH u, count(*) as tweets, sum(t.favorites+size((t)<-[:RETWEETED|REPLIED_TO]-())) as score
WHERE tweets > 5 AND tweets * score > 100
RETURN u.name, u.screen_name, tweets, score
ORDER BY tweets * score DESC LIMIT 500
Looking at the results, it makes sense:
╒══════════════════════╤═════════════════╤════════╤═══════╕
│"u.name"              │"u.screen_name"  │"tweets"│"score"│
╞══════════════════════╪═════════════════╪════════╪═══════╡
│"Sashko Stubailo"     │"stubailo"       │"538"   │"1567" │
├──────────────────────┼─────────────────┼────────┼───────┤
│"Apollo"              │"apollographql"  │"150"   │"1389" │
├──────────────────────┼─────────────────┼────────┼───────┤
│"ReactDOM"            │"ReactDOM"       │"221"   │"596"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"KOYCHEV.DE"          │"K0YCHEV"        │"309"   │"341"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"Graphcool"           │"graphcool"      │"84"    │"859"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"adeeb"               │"_adeeb"         │"179"   │"328"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"ReactJS News"        │"ReactJS_News"   │"93"    │"517"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"Max Stoiber"         │"mxstbr"         │"102"   │"450"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"Caleb Meredith"      │"calebmer"       │"135"   │"273"  │
├──────────────────────┼─────────────────┼────────┼───────┤
│"Lee Byron"           │"leeb"           │"53"    │"652"  │
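To make the ranking logic concrete outside of Cypher, here is a small Python sketch of the same idea: count tweets per user, sum favorites plus retweets and replies into a score, keep users with more than 5 tweets and a tweets-times-score product above 100, then sort by that product. The function name `rank_users`, the tuple layout, and the sample data are all hypothetical; this is not how the community graph itself computes anything.

```python
from collections import defaultdict


def rank_users(tweets, min_tweets=5, min_weight=100, limit=500):
    """Rank users from (screen_name, favorites, retweets_and_replies) tuples."""
    counts = defaultdict(int)
    scores = defaultdict(int)
    for screen_name, favorites, reactions in tweets:
        counts[screen_name] += 1
        # Per-tweet contribution: favorites plus retweets and replies.
        scores[screen_name] += favorites + reactions
    ranked = [
        (name, counts[name], scores[name])
        for name in counts
        if counts[name] > min_tweets and counts[name] * scores[name] > min_weight
    ]
    # Sort by tweets * score, highest first, like the ORDER BY clause.
    ranked.sort(key=lambda row: row[1] * row[2], reverse=True)
    return ranked[:limit]


# Hypothetical sample data: each tuple is one tweet.
sample = [("alice", 3, 1)] * 6 + [("bob", 10, 5)] * 10 + [("carol", 50, 20)] * 2
print(rank_users(sample))
```

In this made-up sample, "carol" has very popular tweets but is dropped because she has only two of them, which mirrors the `tweets > 5` filter.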
Cool, now we can combine our two statements. To save some repeated API calls, I just store the gender information on the user entity (also the accuracy and samples) so that we can reuse it later.
MATCH (u:Twitter:User)-[:POSTED]->(t:Tweet)
// name has to have at least 2 parts, and gender not yet retrieved
WHERE u.name contains " " AND NOT exists(u.gender)
// compute the score
WITH u, count(*) AS tweets, sum(t.favorites+size((t)<-[:RETWEETED|REPLIED_TO]-())) AS score
WHERE tweets > 5 AND tweets * score > 100
// top 500 users
WITH u, tweets, score, head(split(u.name," ")) as firstname
ORDER BY tweets * score DESC LIMIT 500
// call gender api
CALL apoc.load.json("https://gender-api.com/get?key=<key>&name="+firstname) YIELD value
// set result values as properties
SET u.gender = value.gender, u.gender_meta = [value.accuracy,value.samples]
RETURN count(*)
So for the 500 top accounts with a space in their name, we got the gender predicted via the API. Now we can look at our resulting data, and hopefully find some women that we can recommend to Peggy.
MATCH (u:Twitter:User)-[:POSTED]->(t:Tweet)
WHERE u.gender = "female" AND u.gender_meta[0] > 75 and u.gender_meta[1] > 10
WITH u, count(*) AS tweets, sum(t.favorites+size((t)<-[:RETWEETED|REPLIED_TO]-())) AS score
ORDER BY tweets * score DESC LIMIT 50
RETURN u { .screen_name, .name, .followers, .following, .statuses} as user, tweets, score;
Besides the funny ones (Ruby Inside, Else if) and the incorrectly classified ones (Jess, Brooke), we get a number of active women in the GraphQL community who were not recommended before: ladyleet, _KarimaTounsya, thekamahele, lauralindal, thelamkin, eveporcello, and several more.
I manually went over the screen names, looked at the Twitter profiles, and set a check mark √ for female accounts and an ! for new names.
We found 22 women in total — which is of course not a lot if you look at the absolute number of people tweeting about GraphQL, but it is a start and hopefully growing quickly.
Now that we have our list, let’s put it to good use! Be sure to check out the work of these talented and active women and follow them on Twitter if you aren’t already. By recognizing their contributions, we can hopefully inspire more women to be active members of the GraphQL community and discover more names for our list in the future.
PS: We heard from Nikolas, the curator of @graphqlweekly, that our “this week in GraphQL” overview page helped them a lot in compiling the weekly newsletter. It also features a “Twitter Active” tab, which should help you to find people to follow, too.
We’re also happy to offer the community graph service to other communities, so feel free to reach out to us via devrel@neo4j.com, if you’re interested.
PPS: Thanks so much to Peggy Rayzis who started this engaging activity, provided very valuable feedback for this post, and gave her permission for publication. Make sure to follow her on Twitter and here on Medium.