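Django is a classic Python web framework and one of the most popular Python open-source projects. Unlike Flask, Django is highly integrated and helps developers stand up a web project quickly. Starting this week we will read through the Django source code together, to deepen our understanding of Django and become fluent in Python web development. The source version used here is 4.0.0. We start with a skim-reading approach to get a rough picture of how a Django project is created and started.
Create a virtual environment and install Django 4.0.0. At the time of writing this version matches the latest official documentation, which also has a complete Chinese translation. We can follow the "Quick install guide" to create a Django project and get a feel for it. First use the startproject command to create a project named hello; Django scaffolds a basic project structure locally. After entering the hello project, the runserver command starts it directly.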
python3 -m django startproject hello
cd hello && python3 manage.py runserver
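Django has good support for modularity, which makes it suitable for large web projects. A module inside a project is called an app, and one project can contain several apps, much like blueprints in Flask. Next we use the startapp command to create an app called api.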
python3 -m django startapp api
We need to flesh out views.py in the api app by adding the following:
from django.shortcuts import render
# Create your views here.
from django.http import HttpResponse

def index(request):
    return HttpResponse("Hello, Game404. You're at the index.")
Next, create a urls.py module in the app that maps URLs to views, containing:
from django.urls import path
from . import views
urlpatterns = [
path('', views.index, name='index'),
]
With the api app implemented, we need to register it with the project. This takes two steps. First, add the api app to settings.py in the hello project:
INSTALLED_APPS = [
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'api',
]
Then, in the hello project's urls.py, include the URLs and views defined by the api app:
from django.contrib import admin
from django.urls import path, include
urlpatterns = [
path('admin/', admin.site.urls),
path('api/', include('api.urls')),
]
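Everything else can stay exactly as the templates generated it. Start the project again and request the following path:
➜ ~ curl http://127.0.0.1:8000/api/
Hello, Game404. You're at the index.
We have now created and started about the simplest possible Django project. Next, let's look at how this workflow is implemented.
The Django source tree consists roughly of the following packages: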
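Package | Description |
---|---|
apps | Django's app registry |
conf | Configuration, mainly the project and app templates |
contrib | Standard apps that ship with Django |
core | Django's core functionality |
db | Database model implementation |
dispatch | Signals, used to decouple modules |
forms | Form implementation |
http | HTTP protocol and service-related implementation |
middleware | Standard middleware that ships with Django |
template, templatetags | Template functionality |
test | Unit-testing support |
urls | URL handling classes |
utils | Utilities |
views | View-related implementation |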
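I also did a quick comparison of the Django and Flask code bases:
-------------------------------------------------------------------------------
Project                      files          blank        comment           code
-------------------------------------------------------------------------------
django                         716          19001          25416          87825
flask                           20           1611           3158           3587
Judging by file count and line count alone, Django is a large framework, with more than 700 module files and nearly 90,000 lines of code, whereas Flask is a lightweight one. Before reading the Flask source we first had to get to know dependencies such as SQLAlchemy and Werkzeug. Django, judging by the dependencies it declares, needs almost nothing else by default, so we can dive straight in. The contrast is a little like iOS and Android: Flask relies on all sorts of plugins and is more open, while Django bundles everything, and the default framework already covers the vast majority of web development needs.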
Django provides a set of scaffolding commands that help developers create and manage Django projects. The --help option lists them:
python3 -m django --help
Type 'python -m django help <subcommand>' for help on a specific subcommand.
Available subcommands:
[django]
check
compilemessages
createcachetable
dbshell
diffsettings
dumpdata
flush
inspectdb
loaddata
makemessages
makemigrations
migrate
runserver
sendtestemail
shell
showmigrations
sqlflush
sqlmigrate
sqlsequencereset
squashmigrations
startapp
startproject
test
testserver
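The three commands used earlier, startproject, startapp and runserver, are the focus of this article; the other twenty-odd commands will be introduced when we need them. This is the essence of skim reading: follow only the trunk, build a global view first, then go deeper step by step.
The django package's entry point is provided in __main__.py: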
"""
Invokes django-admin when the django module is run as a script.
Example: python -m django check
"""
from django.core import management
if __name__ == "__main__":
    management.execute_from_command_line()
As we can see, the django.core.management module implements the scaffolding. The project's manage.py likewise starts the project by calling into the management module:
#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys
def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ...
    execute_from_command_line(sys.argv)

if __name__ == '__main__':
    main()
execute_from_command_line builds a ManagementUtility and calls its execute method. The main structure of ManagementUtility is as follows:
class ManagementUtility:
    """
    Encapsulate the logic of the django-admin and manage.py utilities.
    """
    def __init__(self, argv=None):
        # Command-line arguments
        self.argv = argv or sys.argv[:]
        self.prog_name = os.path.basename(self.argv[0])
        if self.prog_name == '__main__.py':
            self.prog_name = 'python -m django'
        self.settings_exception = None

    def main_help_text(self, commands_only=False):
        """Return the script's main help text, as a string."""
        pass

    def fetch_command(self, subcommand):
        """
        Try to fetch the given subcommand, printing a message with the
        appropriate command called from the command line (usually
        "django-admin" or "manage.py") if it can't be found.
        """
        pass

    ...

    def execute(self):
        """
        Given the command-line arguments, figure out which subcommand is being
        run, create a parser appropriate to that command, and run it.
        """
        pass
A good command-line tool depends on clear help output. The default help text comes from the main_help_text function:
def main_help_text(self, commands_only=False):
    """Return the script's main help text, as a string."""
    usage = [
        "",
        "Type '%s help <subcommand>' for help on a specific subcommand." % self.prog_name,
        "",
        "Available subcommands:",
    ]
    commands_dict = defaultdict(lambda: [])
    for name, app in get_commands().items():
        commands_dict[app].append(name)
    style = color_style()
    for app in sorted(commands_dict):
        usage.append("")
        usage.append(style.NOTICE("[%s]" % app))
        for name in sorted(commands_dict[app]):
            usage.append(" %s" % name)
    # Output an extra note if settings are not properly configured
    if self.settings_exception is not None:
        usage.append(style.NOTICE(
            "Note that only Django core commands are listed "
            "as settings are not properly configured (error: %s)."
            % self.settings_exception))
    return '\n'.join(usage)
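The grouping step in main_help_text can be tried in isolation. The following is a minimal sketch, not Django's code: group_commands and the sample mapping are hypothetical names made up for illustration, and the colour styling is left out.

```python
from collections import defaultdict


def group_commands(commands):
    """Group a {command_name: app_name} mapping into sorted help lines."""
    by_app = defaultdict(list)
    for name, app in commands.items():
        by_app[app].append(name)

    lines = ["Available subcommands:"]
    for app in sorted(by_app):
        lines.append("")
        lines.append("[%s]" % app)
        for name in sorted(by_app[app]):
            lines.append("    %s" % name)
    return "\n".join(lines)


# Hypothetical sample data; "api" stands in for a user-defined app.
sample = {"runserver": "django.core", "check": "django.core", "seed": "api"}
print(group_commands(sample))
```

Sorting both the app names and the command names keeps the help output stable regardless of the order in which commands were discovered.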
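main_help_text builds the command list through get_commands, which is one of the following four module-level functions in django.core.management: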
def find_commands(management_dir):
    """
    Given a path to a management directory, return a list of all the command
    names that are available.
    """
    command_dir = os.path.join(management_dir, 'commands')
    return [name for _, name, is_pkg in pkgutil.iter_modules([command_dir])
            if not is_pkg and not name.startswith('_')]

def load_command_class(app_name, name):
    """
    Given a command name and an application name, return the Command
    class instance. Allow all errors raised by the import process
    (ImportError, AttributeError) to propagate.
    """
    module = import_module('%s.management.commands.%s' % (app_name, name))
    return module.Command()

def get_commands():
    commands = {name: 'django.core' for name in find_commands(__path__[0])}
    if not settings.configured:
        return commands
    for app_config in reversed(list(apps.get_app_configs())):
        path = os.path.join(app_config.path, 'management')
        commands.update({name: app_config.name for name in find_commands(path)})
    return commands

def call_command(command_name, *args, **options):
    pass
find_commands looks for command modules under management/commands, and load_command_class imports a command with import_module, which loads a module dynamically by name.
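To see the discovery step work without a Django checkout, here is a small sketch that runs the same pkgutil.iter_modules scan against a throwaway directory. The demo function and the file names seed.py and _private.py are assumptions made up for this illustration.

```python
import os
import pkgutil
import tempfile


def find_commands(management_dir):
    """Return module names under <management_dir>/commands, skipping
    packages and names that start with an underscore."""
    command_dir = os.path.join(management_dir, "commands")
    return [
        name
        for _, name, is_pkg in pkgutil.iter_modules([command_dir])
        if not is_pkg and not name.startswith("_")
    ]


def demo():
    """Build a temporary management/commands directory and scan it."""
    with tempfile.TemporaryDirectory() as management_dir:
        command_dir = os.path.join(management_dir, "commands")
        os.makedirs(command_dir)
        for filename in ("__init__.py", "seed.py", "_private.py"):
            with open(os.path.join(command_dir, filename), "w") as f:
                f.write("")
        return sorted(find_commands(management_dir))


print(demo())
```

Only seed is reported: the scan never lists __init__, and the underscore filter hides _private, which is how a commands directory can hold helper modules that are not commands.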
A subcommand is run by locating it with fetch_command and then calling its run_from_argv method.
def execute(self):
    ...
    self.fetch_command(subcommand).run_from_argv(self.argv)

def fetch_command(self, subcommand):
    """
    Try to fetch the given subcommand, printing a message with the
    appropriate command called from the command line (usually
    "django-admin" or "manage.py") if it can't be found.
    """
    # Get commands outside of try block to prevent swallowing exceptions
    commands = get_commands()
    try:
        app_name = commands[subcommand]
    except KeyError:
        ...
    if isinstance(app_name, BaseCommand):
        # If the command is already loaded, use it directly.
        klass = app_name
    else:
        klass = load_command_class(app_name, subcommand)
    return klass
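Most of the commands that ship in Django's core inherit directly from BaseCommand.
Within BaseCommand, the two most important methods, run_from_argv and execute, are the entry points for running a command: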
def run_from_argv(self, argv):
    ...
    parser = self.create_parser(argv[0], argv[1])
    options = parser.parse_args(argv[2:])
    cmd_options = vars(options)
    # Move positional args out of options to mimic legacy optparse
    args = cmd_options.pop('args', ())
    handle_default_options(options)
    try:
        self.execute(*args, **cmd_options)
    except CommandError as e:
        ...

def execute(self, *args, **options):
    output = self.handle(*args, **options)
    return output
The handle method is left for subclasses to override:
def handle(self, *args, **options):
    """
    The actual logic of the command. Subclasses must implement
    this method.
    """
    raise NotImplementedError('subclasses of BaseCommand must provide a handle() method')
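The run_from_argv, execute, handle chain is a template-method pattern. The sketch below is a heavily simplified stand-in built on argparse, not Django's actual BaseCommand; GreetCommand and its --name option are hypothetical and exist only for this example.

```python
import argparse


class BaseCommand:
    """Simplified stand-in for django.core.management.base.BaseCommand."""

    help = ""

    def create_parser(self, prog_name, subcommand):
        parser = argparse.ArgumentParser(
            prog="%s %s" % (prog_name, subcommand), description=self.help
        )
        self.add_arguments(parser)
        return parser

    def add_arguments(self, parser):
        """Hook for subclasses to declare their own arguments."""

    def run_from_argv(self, argv):
        # argv looks like ["manage.py", "greet", "--name", "Ada"].
        parser = self.create_parser(argv[0], argv[1])
        options = vars(parser.parse_args(argv[2:]))
        return self.execute(**options)

    def execute(self, **options):
        return self.handle(**options)

    def handle(self, **options):
        raise NotImplementedError("subclasses must provide a handle() method")


class GreetCommand(BaseCommand):
    """Hypothetical command used only for this illustration."""

    help = "Print a greeting."

    def add_arguments(self, parser):
        parser.add_argument("--name", default="world")

    def handle(self, **options):
        return "Hello, %s!" % options["name"]


print(GreetCommand().run_from_argv(["manage.py", "greet", "--name", "Ada"]))
```

The base class owns argument parsing and the call order, so a concrete command only has to declare its arguments and implement handle.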
The startproject and startapp commands create a project and an app respectively. Both derive from TemplateCommand, which holds most of the implementation.
# startproject
def handle(self, **options):
    project_name = options.pop('name')
    target = options.pop('directory')
    # Create a random SECRET_KEY to put it in the main settings.
    options['secret_key'] = SECRET_KEY_INSECURE_PREFIX + get_random_secret_key()
    super().handle('project', project_name, target, **options)

# startapp
def handle(self, **options):
    app_name = options.pop('name')
    target = options.pop('directory')
    super().handle('app', app_name, target, **options)
After running startproject, the project layout looks roughly like this:
├── hello
│ ├── __init__.py
│ ├── asgi.py
│ ├── settings.py
│ ├── urls.py
│ └── wsgi.py
└── manage.py
This matches the template files in the project_template directory under conf:
.
├── manage.py-tpl
└── project_name
    ├── __init__.py-tpl
    ├── asgi.py-tpl
    ├── settings.py-tpl
    ├── urls.py-tpl
    └── wsgi.py-tpl
Contents of the manage.py-tpl template file:
#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys

def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', '{{ project_name }}.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ....

if __name__ == '__main__':
    main()
So startproject takes the project_name supplied by the developer, renders it into the template files, and writes out the project files.
The template-handling function in TemplateCommand looks roughly like this:
from django.template import Context, Engine

def handle(self, app_or_project, name, target=None, **options):
    ...
    base_name = '%s_name' % app_or_project
    base_subdir = '%s_template' % app_or_project
    base_directory = '%s_directory' % app_or_project
    camel_case_name = 'camel_case_%s_name' % app_or_project
    camel_case_value = ''.join(x for x in name.title() if x != '_')
    ...
    context = Context({
        **options,
        base_name: name,
        base_directory: top_dir,
        camel_case_name: camel_case_value,
        'docs_version': get_docs_version(),
        'django_version': django.__version__,
    }, autoescape=False)
    ...
    template_dir = self.handle_template(options['template'],
                                        base_subdir)
    ...
    for root, dirs, files in os.walk(template_dir):
        for filename in files:
            if new_path.endswith(extensions) or filename in extra_files:
                with open(old_path, encoding='utf-8') as template_file:
                    content = template_file.read()
                template = Engine().from_string(content)
                content = template.render(context)
                with open(new_path, 'w', encoding='utf-8') as new_file:
                    new_file.write(content)
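The walk, render, write loop can be imitated with the standard library alone. This sketch deliberately swaps Django's template Engine for a plain regular-expression substitution of {{ name }} placeholders, so it only illustrates the overall flow; render, copy_template and demo are hypothetical helpers, not Django APIs.

```python
import os
import re
import tempfile

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(text, context):
    """Replace {{ name }} placeholders; unknown names are left untouched."""
    return PLACEHOLDER.sub(lambda m: str(context.get(m.group(1), m.group(0))), text)


def copy_template(template_dir, target_dir, context, suffix="-tpl"):
    """Walk template_dir, render every *-tpl file, and write it to target_dir
    with the suffix removed. Directory names containing a context key
    (e.g. 'project_name') are renamed to the corresponding value."""
    for root, _dirs, files in os.walk(template_dir):
        rel = os.path.relpath(root, template_dir)
        for key, value in context.items():
            rel = rel.replace(key, str(value))
        out_dir = os.path.normpath(os.path.join(target_dir, rel))
        os.makedirs(out_dir, exist_ok=True)
        for filename in files:
            with open(os.path.join(root, filename), encoding="utf-8") as src:
                content = src.read()
            if filename.endswith(suffix):
                filename = filename[: -len(suffix)]
                content = render(content, context)
            with open(os.path.join(out_dir, filename), "w", encoding="utf-8") as dst:
                dst.write(content)


def demo():
    """Render a one-file template tree into a temporary target directory."""
    with tempfile.TemporaryDirectory() as tmp:
        template_dir = os.path.join(tmp, "tpl")
        os.makedirs(os.path.join(template_dir, "project_name"))
        tpl_path = os.path.join(template_dir, "project_name", "settings.py-tpl")
        with open(tpl_path, "w", encoding="utf-8") as f:
            f.write("ROOT_URLCONF = '{{ project_name }}.urls'\n")
        target = os.path.join(tmp, "out")
        copy_template(template_dir, target, {"project_name": "hello"})
        with open(os.path.join(target, "hello", "settings.py"), encoding="utf-8") as f:
            return f.read()


print(demo())
```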
How django.template.Engine works, and how it differs from the Mako template engine, will be covered later; for this chapter a passing familiarity is enough.
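runserver provides an HTTP server for development and testing that starts the Django project; it is also the most frequently used Django command. Django projects follow the WSGI specification, so before starting, runserver has to locate the WSGI application. (If you are unfamiliar with WSGI, see the earlier articles in this series.)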
def get_internal_wsgi_application():
    """
    Load and return the WSGI application as configured by the user in
    ``settings.WSGI_APPLICATION``. With the default ``startproject`` layout,
    this will be the ``application`` object in ``projectname/wsgi.py``.

    This function, and the ``WSGI_APPLICATION`` setting itself, are only useful
    for Django's internal server (runserver); external WSGI servers should just
    be configured to point to the correct application object directly.

    If settings.WSGI_APPLICATION is not set (is ``None``), return
    whatever ``django.core.wsgi.get_wsgi_application`` returns.
    """
    from django.conf import settings
    app_path = getattr(settings, 'WSGI_APPLICATION')
    if app_path is None:
        return get_wsgi_application()
    try:
        return import_string(app_path)
    except ImportError as err:
        ...
From the docstring we can see that this loads the application the developer configured in the project's settings:
WSGI_APPLICATION = 'hello.wsgi.application'
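Resolving a dotted path such as 'hello.wsgi.application' into an object comes down to importing the module part and fetching the last name from it. The function below is a rough sketch of that idea written against importlib; it is not a copy of Django's import_string, and the error messages are made up.

```python
from importlib import import_module


def import_string(dotted_path):
    """Import 'package.module.attr' and return attr."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ImportError("%r doesn't look like a module path" % dotted_path)
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as err:
        raise ImportError(
            "Module %r does not define %r" % (module_path, attr_name)
        ) from err


# Use a standard-library path so the example runs without a Django project.
join = import_string("os.path.join")
print(join("hello", "wsgi.py"))
```

Keeping the setting as a string means settings.py never has to import the WSGI module itself; the lookup is deferred until the server actually needs the application.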
By default, the project's WSGI application (in wsgi.py) looks like this:
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')
application = get_wsgi_application()
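For readers who want a reminder of what a WSGI application object is, here is a minimal sketch using only the standard library. The application callable below is a toy, not what get_wsgi_application returns, and call_app is a hypothetical helper that invokes it once without opening a socket.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable: it receives the request environ and a
    start_response callback, and returns an iterable of bytes."""
    body = ("You requested %s" % environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Invoke a WSGI app once and return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(application, "/api/"))
```

Any server that speaks WSGI, including the development server discussed next, only needs such a callable; that is why WSGI_APPLICATION can simply name one.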
Continuing with runserver's execution, the main work happens in the inner_run function:
from django.core.servers.basehttp import run

def inner_run(self, *args, **options):
    try:
        handler = self.get_handler(*args, **options)
        run(self.addr, int(self.port), handler,
            ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
    except OSError as e:
        ...
        # Need to use an OS exit because sys.exit doesn't work in a thread
        os._exit(1)
    except KeyboardInterrupt:
        ...
        sys.exit(0)
The HTTP server is implemented in django.core.servers.basehttp, which bridges the HTTP service and WSGI.
The code of its run function:
def run(addr, port, wsgi_handler, ipv6=False, threading=False, server_cls=WSGIServer):
    server_address = (addr, port)
    if threading:
        httpd_cls = type('WSGIServer', (socketserver.ThreadingMixIn, server_cls), {})
    else:
        httpd_cls = server_cls
    httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
    if threading:
        # ThreadingMixIn.daemon_threads indicates how threads will behave on an
        # abrupt shutdown; like quitting the server by the user or restarting
        # by the auto-reloader. True means the server will not wait for thread
        # termination before it quits. This will make auto-reloader faster
        # and will prevent the need to kill the server manually if a thread
        # isn't terminating correctly.
        httpd.daemon_threads = True
    httpd.set_app(wsgi_handler)
    httpd.serve_forever()
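The three-argument form of type() used in run builds a class at runtime. The sketch below shows just that step; as an assumption for self-containment it uses the standard library's wsgiref.simple_server.WSGIServer in place of Django's own WSGIServer subclass, and build_server_class is a hypothetical helper name.

```python
import socketserver
from wsgiref.simple_server import WSGIServer


def build_server_class(server_cls=WSGIServer, threading=False):
    """Return server_cls itself, or a new subclass with ThreadingMixIn first."""
    if threading:
        return type("WSGIServer", (socketserver.ThreadingMixIn, server_cls), {})
    return server_cls


threaded = build_server_class(threading=True)
# Show the first entries of the method resolution order.
print([cls.__name__ for cls in threaded.__mro__][:3])
```

Putting ThreadingMixIn first in the bases matters: its process_request must come before the server's own in the method resolution order so that each request is handled in a new thread.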
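Django's WSGI application and HTTP protocol implementations will be covered in detail in the next chapter, so we skip them for now. runserver has one more very important feature: automatic restart. When we modify the project's code the server restarts on its own, which speeds up development.
For example, if we edit the api view and add a few characters, the console shows output roughly like this: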
# python manage.py runserver
Watching for file changes with StatReloader
Performing system checks...
System check identified no issues (0 silenced).
March 05, 2022 - 09:06:07
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
/Users/yoo/tmp/django/hello/api/views.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks...
System check identified no issues (0 silenced).
March 05, 2022 - 09:12:59
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
runserver detects that /hello/api/views.py was modified and, with StatReloader watching, restarts the server automatically.
The command's run method can start inner_run in two ways; by default it goes through autoreload:
def run(self, **options):
    """Run the server, using the autoreloader if needed."""
    ...
    use_reloader = options['use_reloader']
    if use_reloader:
        autoreload.run_with_reloader(self.inner_run, **options)
    else:
        self.inner_run(None, **options)
The reloader class hierarchy in autoreload is as follows:
class BaseReloader:
    pass

class StatReloader(BaseReloader):
    pass

class WatchmanReloader(BaseReloader):
    pass

def get_reloader():
    """Return the most suitable reloader for this environment."""
    try:
        WatchmanReloader.check_availability()
    except WatchmanUnavailable:
        return StatReloader()
    return WatchmanReloader()
The current version prefers WatchmanReloader, which depends on the pywatchman library and must be installed separately. Otherwise it falls back to StatReloader. We met a similar implementation in the earlier Werkzeug article; both boil down to continuously polling files for state changes.
class StatReloader(BaseReloader):
    SLEEP_TIME = 1  # Check for changes once per second.

    def tick(self):
        mtimes = {}
        while True:
            for filepath, mtime in self.snapshot_files():
                old_time = mtimes.get(filepath)
                mtimes[filepath] = mtime
                if old_time is None:
                    logger.debug('File %s first seen with mtime %s', filepath, mtime)
                    continue
                elif mtime > old_time:
                    logger.debug('File %s previous mtime: %s, current mtime: %s', filepath, old_time, mtime)
                    self.notify_file_changed(filepath)
            time.sleep(self.SLEEP_TIME)
            yield
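The core of the polling approach is comparing two snapshots of modification times. Here is a small self-contained sketch of that comparison; snapshot, changed_files and demo are hypothetical helpers rather than Django functions, and the demo bumps the mtime with os.utime instead of waiting for a real edit.

```python
import os
import tempfile


def snapshot(paths):
    """Map each existing path to its last-modified time."""
    mtimes = {}
    for path in paths:
        try:
            mtimes[path] = os.stat(path).st_mtime
        except OSError:
            continue  # the file vanished between listing and stat
    return mtimes


def changed_files(previous, current):
    """Return paths whose mtime increased since the previous snapshot.
    Paths seen for the first time are ignored, as in the loop above."""
    return [
        path
        for path, mtime in current.items()
        if path in previous and mtime > previous[path]
    ]


def demo():
    """Create a file, move its mtime forward, and check it is reported."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "views.py")
        with open(path, "w") as f:
            f.write("# v1\n")
        before = snapshot([path])
        # Set the mtime explicitly so the demo does not rely on clock resolution.
        later = before[path] + 10
        os.utime(path, (later, later))
        after = snapshot([path])
        return changed_files(before, after) == [path]


print(demo())
```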
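When a file changes, the serving process exits with code 3; the parent process sees that return code and uses subprocess to start a new one: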
def trigger_reload(filename):
    logger.info('%s changed, reloading.', filename)
    sys.exit(3)

def restart_with_reloader():
    new_environ = {**os.environ, DJANGO_AUTORELOAD_ENV: 'true'}
    args = get_child_arguments()
    while True:
        p = subprocess.run(args, env=new_environ, close_fds=False)
        if p.returncode != 3:
            return p.returncode
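The exit-code handshake between parent and child can be reproduced in a few lines. In this sketch the child is a made-up script that counts its own launches in a file and asks for a reload (exit code 3) until its third start; restart_until_done, CHILD and demo are illustrative names, not Django's.

```python
import os
import subprocess
import sys
import tempfile

RELOAD_EXIT_CODE = 3

# Hypothetical child script: it records how many times it has been started
# and exits with the reload code until the third launch.
CHILD = """
import sys
path = sys.argv[1]
try:
    with open(path) as f:
        runs = int(f.read() or 0)
except FileNotFoundError:
    runs = 0
runs += 1
with open(path, "w") as f:
    f.write(str(runs))
sys.exit(3 if runs < 3 else 0)
"""


def restart_until_done(args):
    """Re-run the child for as long as it exits with the reload code."""
    launches = 0
    while True:
        launches += 1
        returncode = subprocess.run(args).returncode
        if returncode != RELOAD_EXIT_CODE:
            return launches, returncode


def demo():
    """Run the loop against the counting child; returns (launches, exit code)."""
    with tempfile.TemporaryDirectory() as tmp:
        counter = os.path.join(tmp, "runs.txt")
        return restart_until_done([sys.executable, "-c", CHILD, counter])


print(demo())
```

Reserving one exit code for "please restart me" lets the parent tell a reload request apart from a normal exit or a crash, whose return codes it simply passes on.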
Before runserver executes there is one more important step, setup: loading and initializing the apps defined by the developer. This begins in ManagementUtility's execute function:
def execute(self):
    ...
    try:
        settings.INSTALLED_APPS
    except ImproperlyConfigured as exc:
        self.settings_exception = exc
    except ImportError as exc:
        self.settings_exception = exc
    ...
Settings loads the settings module; the entry that matters most here is INSTALLED_APPS:
class Settings:
    def __init__(self, settings_module):
        ...
        # store the settings module in case someone later cares
        self.SETTINGS_MODULE = settings_module
        mod = importlib.import_module(self.SETTINGS_MODULE)
        tuple_settings = (
            'ALLOWED_HOSTS',
            "INSTALLED_APPS",
            "TEMPLATE_DIRS",
            "LOCALE_PATHS",
        )
        self._explicit_settings = set()
        ...
INSTALLED_APPS is defined in the project's settings.py:
# Application definition
INSTALLED_APPS = [
'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'api',
]
This is how the Django framework dynamically loads the developer's own code.
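The idea of importing a settings module by name and reading its upper-case names can be sketched as follows. This is a simplified illustration, not Django's Settings class: the validation and default handling are omitted, and hello_settings_demo is a hypothetical in-memory module registered only so the example needs no files.

```python
import importlib
import sys
import types


class Settings:
    """Copy the upper-case names of a settings module onto this object."""

    def __init__(self, settings_module):
        self.SETTINGS_MODULE = settings_module
        mod = importlib.import_module(settings_module)
        for name in dir(mod):
            if name.isupper():
                setattr(self, name, getattr(mod, name))


# Register a hypothetical in-memory module so the sketch needs no files.
fake = types.ModuleType("hello_settings_demo")
fake.INSTALLED_APPS = ["django.contrib.admin", "api"]
fake.helper = "ignored because the name is not upper-case"
sys.modules[fake.__name__] = fake

settings = Settings("hello_settings_demo")
print(settings.INSTALLED_APPS)
```

Because the module is located by its dotted name at runtime, the framework never has to know the project's package name in advance; DJANGO_SETTINGS_MODULE supplies it.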
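Django is a highly integrated Python web framework that supports modular development. It also provides a set of scaffolding commands: startproject and startapp help create project and app skeletons, and runserver helps with testing and development. Django projects conform to the WSGI specification; starting the HTTP service creates a WSGIServer, which also supports a multi-threaded mode. As a framework, Django dynamically loads the developer's business code through the conventional settings file.
Django commands offer "did you mean" suggestions. For example, suppose we mistype the runserver command and accidentally type i instead of u. The command prompts us, asking whether we meant runserver: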
python -m django rinserver
No Django settings specified.
Unknown command: 'rinserver'. Did you mean runserver?
Type 'python -m django help' for usage.
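Suggestions like this are very helpful in a command-line tool. The usual implementation compares the user's input with the known commands and picks the closest one, a practical application of string-similarity algorithms in the same family as edit distance. Personally I find that understanding the use case is more valuable than grinding algorithm problems, and I occasionally use this example in interviews to gauge a candidate's algorithm skills. Django simply uses the implementation from difflib in the Python standard library, which ranks candidates by a sequence-similarity ratio: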
from difflib import get_close_matches

possible_matches = get_close_matches(subcommand, commands)
sys.stderr.write('Unknown command: %r' % subcommand)
if possible_matches:
    sys.stderr.write('. Did you mean %s?' % possible_matches[0])
Example usage of get_close_matches:
>>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
['apple', 'ape']
Readers interested in the algorithm can dig into its implementation details.
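A "did you mean" helper for any command-line tool can be sketched in a few lines on top of get_close_matches. The suggest function and the known list are hypothetical; only get_close_matches itself comes from the standard library.

```python
from difflib import get_close_matches


def suggest(subcommand, commands):
    """Return the closest known command name, or None if nothing is close."""
    matches = get_close_matches(subcommand, commands)
    return matches[0] if matches else None


known = ["runserver", "startapp", "startproject", "migrate"]
print(suggest("rinserver", known))
```

get_close_matches returns candidates ordered from most to least similar and drops anything below its similarity cutoff, so taking the first element gives the best guess and an empty list means there is nothing worth suggesting.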
Another small tip concerns a naming workaround. class is a keyword in many programming languages, so a variable that holds a class needs a different name to avoid clashing with the keyword. One option is klass:
if isinstance(app_name, BaseCommand):
    # If the command is already loaded, use it directly.
    klass = app_name
else:
    klass = load_command_class(app_name, subcommand)
Another option is clazz, as in this JavaScript excerpt:
if ( classes.length ) {
    while ( ( elem = this[ i++ ] ) ) {
        curValue = getClass( elem );
        ...
        if ( cur ) {
            j = 0;
            while ( ( clazz = classes[ j++ ] ) ) {
                // Remove *all* instances
                while ( cur.indexOf( " " + clazz + " " ) > -1 ) {
                    cur = cur.replace( " " + clazz + " ", " " );
                }
            }
            ...
        }
    }
}
Which one do you usually use?