I have written a FastAPI application. Now I am thinking about deploying it, but I seem to be getting strange, unexpected performance problems that appear to depend on whether I use uvicorn or gunicorn. In particular, all code (even standard-library pure-Python code) seems to get slower if I use gunicorn. For performance debugging I wrote a small application that demonstrates this:
import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime

app = FastAPI()


@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        times.append(str(datetime.now() - start_time))
    return {"delays": [delay1, delay2], "total_time_taken": str(datetime.now() - total_start_time), "times": times}
Running the FastAPI app with:
gunicorn api.performance_test:app -b localhost:8001 -k uvicorn.workers.UvicornWorker --workers 1
and issuing a GET to http://localhost:8001/delay/0.0/0.0
consistently produces a response body like:
{
"delays": [
0.0,
0.0
],
"total_time_taken": "0:00:00.057946",
"times": [
"0:00:00.000323",
"...similar values omitted for brevity...",
"0:00:00.000274"
]
}
However, using:
uvicorn api.performance_test:app --port 8001
I consistently get timings like this:
{
"delays": [
0.0,
0.0
],
"total_time_taken": "0:00:00.002630",
"times": [
"0:00:00.000037",
...snip...
"0:00:00.000020"
]
}
This difference becomes even more pronounced when I comment out the await asyncio.sleep(delay1) statement.
So I am wondering what gunicorn/uvicorn does to the Python/FastAPI runtime that creates this factor-of-10 difference in code execution speed.
I ran these tests using Python 3.8.2 on OS X 11.2.3 with an Intel I7 processor.
These are the relevant parts of my pip freeze output:
fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.13.4
Since fastapi is an ASGI framework, it will perform best on an ASGI server such as uvicorn or hypercorn. A WSGI server like gunicorn cannot deliver performance comparable to uvicorn, because ASGI servers are optimized for async functions. The official fastapi documentation also encourages using an ASGI server such as uvicorn or hypercorn:
https://fastapi.tiangolo.com/#installation
The difference is due to the underlying web server you use.
An analogy could be: two cars, same brand, same options, just a different engine. What's the difference?
Web servers are not exactly like cars, but I guess you get the point I'm trying to make.
Basically, gunicorn is a synchronous web server, while uvicorn is an asynchronous web server. Since you are using fastapi and the await keyword, I guess you already know what asyncio/asynchronous programming is.
I don't know the code differences, so take my answer with a grain of salt, but uvicorn is more performant because of the asynchronous part. My guess about the timing difference is that an async web server is already configured at startup to handle async functions, whereas a sync web server is not, and there is some kind of overhead to abstract that part away.
It's not a proper answer, but it gives you a hint as to where the difference lies.
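The sync-vs-async framing can be made concrete with the two sleep calls from the question's endpoint. The following is a standalone sketch, not part of either server setup: it shows that `await asyncio.sleep` lets waits overlap, while `time.sleep` blocks the event loop and forces them to run one after another, which is why the benchmark's mix of the two sleeps matters regardless of server.

```python
import asyncio
import time

# Sketch contrasting the two sleeps used in the question's endpoint:
# asyncio.sleep yields to the event loop, so many coroutines can wait
# concurrently, while time.sleep blocks the whole loop.

async def nonblocking():
    await asyncio.sleep(0.05)  # yields control to the event loop

async def blocking():
    time.sleep(0.05)  # blocks the event loop for the full duration

async def main():
    start = time.perf_counter()
    await asyncio.gather(*[nonblocking() for _ in range(10)])
    concurrent = time.perf_counter() - start  # ~0.05 s: waits overlap

    start = time.perf_counter()
    await asyncio.gather(*[blocking() for _ in range(10)])
    serial = time.perf_counter() - start  # ~0.5 s: waits are serialized
    return concurrent, serial

concurrent, serial = asyncio.run(main())
print(f"concurrent: {concurrent:.2f}s, serial: {serial:.2f}s")
```

Note that this difference exists under any ASGI server; it is a property of the event loop, not of uvicorn or gunicorn specifically.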
My environment: Ubuntu on WSL2 on Windows 10
Relevant parts of my pip freeze output:
fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.14.0
I modified the code slightly:
import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime
import statistics

app = FastAPI()


@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        time_delta = (datetime.now() - start_time).microseconds
        times.append(time_delta)
    times_average = statistics.mean(times)
    return {"delays": [delay1, delay2], "total_time_taken": (datetime.now() - total_start_time).microseconds, "times_avarage": times_average, "times": times}
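One caveat about the timing approach above (in the original code as well as this modified version): `.microseconds` on a timedelta returns only the sub-second component, so it silently wraps for intervals of one second or more. That is harmless with 0.0 delays, but for longer measurements time.perf_counter() is safer. A minimal sketch, independent of the FastAPI app:

```python
import time

# time.perf_counter() is a monotonic clock intended for measuring elapsed
# time; unlike datetime subtraction + .microseconds it never wraps at 1 s.
start = time.perf_counter()
time.sleep(0.01)  # stand-in for the work being measured
elapsed_us = (time.perf_counter() - start) * 1_000_000
print(round(elapsed_us))  # roughly 10_000, i.e. about 10 ms
```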
Apart from the very first load of the site, my results for both methods are almost the same.
For both methods the times are mostly between 0:00:00.000530 and 0:00:00.000620.
The first attempt of each takes longer: around 0:00:00.003000. However, after I restarted Windows and tried these tests again, I noticed I no longer get increased times on the first request after a server start (I think that is thanks to a lot of free RAM after the restart).
Examples of non-first runs (3 attempts):
# `uvicorn performance_test:app --port 8083`
{"delays":[0.0,0.0],"total_time_taken":553,"times_avarage":4.4,"times":[15,7,5,4,4,4,4,5,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,5,5,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":575,"times_avarage":4.61,"times":[15,6,5,5,5,5,5,5,5,5,5,4,5,5,5,5,4,4,4,4,4,5,5,5,4,5,4,4,4,5,5,5,4,5,5,4,4,4,4,5,5,5,5,4,4,4,4,5,5,4,4,4,4,4,4,4,4,5,5,4,4,4,4,5,5,5,5,5,5,5,4,4,4,4,5,5,4,5,5,4,4,4,4,4,4,5,5,5,4,4,4,4,5,5,5,5,4,4,4,4]}
{"delays":[0.0,0.0],"total_time_taken":548,"times_avarage":4.31,"times":[14,6,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,4,4,4,5,5,4,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4]}
# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`
{"delays":[0.0,0.0],"total_time_taken":551,"times_avarage":4.34,"times":[13,6,5,5,5,5,5,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5]}
{"delays":[0.0,0.0],"total_time_taken":558,"times_avarage":4.48,"times":[14,7,5,5,5,5,5,5,4,4,4,4,4,4,5,5,4,4,4,4,5,4,4,4,5,5,4,4,4,5,5,4,4,4,5,4,4,4,5,5,4,4,4,4,5,5,4,4,5,5,4,4,5,5,4,4,4,5,4,4,5,4,4,5,5,4,4,4,5,4,4,4,5,4,4,4,5,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":550,"times_avarage":4.34,"times":[15,6,5,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,5,5,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4]}
Examples of non-first runs with commented await asyncio.sleep(delay1) (3 attempts):
# `uvicorn performance_test:app --port 8083`
{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.6,"times":[3,1,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0]}
{"delays":[0.0,0.0],"total_time_taken":162,"times_avarage":0.49,"times":[3,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1]}
{"delays":[0.0,0.0],"total_time_taken":156,"times_avarage":0.61,"times":[3,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}
# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`
{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.59,"times":[2,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0]}
{"delays":[0.0,0.0],"total_time_taken":165,"times_avarage":0.62,"times":[3,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1]}
{"delays":[0.0,0.0],"total_time_taken":164,"times_avarage":0.54,"times":[2,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}
I made a Python script to benchmark these times more precisely:
import statistics
import requests
from time import sleep

number_of_tests = 1000

sites_to_test = [
    {
        'name': 'only uvicorn ',
        'url': 'http://127.0.0.1:8083/delay/0.0/0.0'
    },
    {
        'name': 'gunicorn+uvicorn',
        'url': 'http://127.0.0.1:8084/delay/0.0/0.0'
    }]


for test in sites_to_test:

    total_time_taken_list = []
    times_avarage_list = []

    requests.get(test['url'])  # first request may be slower, so better to not measure it

    for a in range(number_of_tests):
        r = requests.get(test['url'])
        json = r.json()
        total_time_taken_list.append(json['total_time_taken'])
        times_avarage_list.append(json['times_avarage'])
        # sleep(1) # results are slightly different with sleep between requests

    total_time_taken_avarage = statistics.mean(total_time_taken_list)
    times_avarage_avarage = statistics.mean(times_avarage_list)

    print({'name': test['name'], 'number_of_tests': number_of_tests, 'total_time_taken_avarage': total_time_taken_avarage, 'times_avarage_avarage': times_avarage_avarage})
Results:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 586.5985, 'times_avarage_avarage': 4.820865}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 571.8415, 'times_avarage_avarage': 4.719035}
Results with commented await asyncio.sleep(delay1):
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 151.301, 'times_avarage_avarage': 0.602495}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 144.4655, 'times_avarage_avarage': 0.59196}
I also made another version of the above script that changes the URL on every request (it gives slightly higher times):
import statistics
import requests
from time import sleep

number_of_tests = 1000

sites_to_test = [
    {
        'name': 'only uvicorn ',
        'url': 'http://127.0.0.1:8083/delay/0.0/0.0',
        'total_time_taken_list': [],
        'times_avarage_list': []
    },
    {
        'name': 'gunicorn+uvicorn',
        'url': 'http://127.0.0.1:8084/delay/0.0/0.0',
        'total_time_taken_list': [],
        'times_avarage_list': []
    }]


for test in sites_to_test:
    requests.get(test['url'])  # first request may be slower, so better to not measure it

for a in range(number_of_tests):
    for test in sites_to_test:
        r = requests.get(test['url'])
        json = r.json()
        test['total_time_taken_list'].append(json['total_time_taken'])
        test['times_avarage_list'].append(json['times_avarage'])
        # sleep(1) # results are slightly different with sleep between requests

for test in sites_to_test:
    total_time_taken_avarage = statistics.mean(test['total_time_taken_list'])
    times_avarage_avarage = statistics.mean(test['times_avarage_list'])
    print({'name': test['name'], 'number_of_tests': number_of_tests, 'total_time_taken_avarage': total_time_taken_avarage, 'times_avarage_avarage': times_avarage_avarage})
Results:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.4315, 'times_avarage_avarage': 4.789385}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.0915, 'times_avarage_avarage': 4.761095}
Results with commented await asyncio.sleep(delay1):
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 152.8365, 'times_avarage_avarage': 0.59173}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 154.4525, 'times_avarage_avarage': 0.59768}
This answer should help you debug your results better.
I think it may help the investigation if you share more details about your OS/machine.
Also, please restart your computer/server; it may have an impact.
Update 1:
I noticed that I used a newer uvicorn version, 0.14.0, than stated in the question (0.13.4). I also tested with the older version 0.13.4, but the results are similar; I still cannot reproduce your results.
Update 2:
I ran some more benchmarks and found something interesting:
Whole requirements.txt:
uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0
uvloop==0.15.2
Results:
{'name': 'only uvicorn ', 'number_of_tests': 500, 'total_time_taken_avarage': 362.038, 'times_avarage_avarage': 2.54142}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 366.814, 'times_avarage_avarage': 2.56766}
Whole requirements.txt:
uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0
Results:
{'name': 'only uvicorn ', 'number_of_tests': 500, 'total_time_taken_avarage': 595.578, 'times_avarage_avarage': 4.83828}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 584.64, 'times_avarage_avarage': 4.7155}
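The two benchmark runs above suggest that the meaningful variable is whether uvloop is installed (uvicorn uses it automatically when available), not whether gunicorn wraps the uvicorn worker. As a quick sanity check, not part of the original benchmarks, you can inspect which event loop implementation is actually in use by looking at the running loop's class:

```python
import asyncio

# If uvloop is available, switch to it the same way uvicorn does when it
# detects the package; otherwise fall back to the stdlib event loop.
try:
    import uvloop  # installed via e.g. `pip install uvloop`
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    expected = "uvloop"
except ImportError:
    expected = "asyncio"

async def which_loop():
    # The module of the running loop's class reveals the implementation.
    return type(asyncio.get_running_loop()).__module__

module = asyncio.run(which_loop())
print(module.split(".")[0])  # "uvloop" if installed, otherwise "asyncio"
```

A similar check inside a temporary FastAPI endpoint would confirm which loop each server setup is really running on.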
Update 3:
I used only Python 3.9.5 in this answer.