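The scripts in the following sections show how to trace network-related functions and how to profile network activity.
This section shows how to profile network activity with SystemTap. The nettop.stp script below gives a glimpse of how much network traffic each process generates.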
nettop.stp
#! /usr/bin/env stap

# Per-process statistics, keyed by [pid, device, command, uid]
global ifxmit, ifrecv
global ifmerged

probe netdev.transmit
{
  ifxmit[pid(), dev_name, execname(), uid()] <<< length
}

probe netdev.receive
{
  ifrecv[pid(), dev_name, execname(), uid()] <<< length
}

function print_activity()
{
  printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
         "PID", "UID", "DEV", "XMIT_PK", "RECV_PK",
         "XMIT_KB", "RECV_KB", "COMMAND")

  # Merge the packet counts so the report can be sorted by total activity
  foreach ([pid, dev, exec, uid] in ifrecv) {
    ifmerged[pid, dev, exec, uid] += @count(ifrecv[pid,dev,exec,uid]);
  }
  foreach ([pid, dev, exec, uid] in ifxmit) {
    ifmerged[pid, dev, exec, uid] += @count(ifxmit[pid,dev,exec,uid]);
  }
  foreach ([pid, dev, exec, uid] in ifmerged-) {
    n_xmit = @count(ifxmit[pid, dev, exec, uid])
    n_recv = @count(ifrecv[pid, dev, exec, uid])
    printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
           pid, uid, dev, n_xmit, n_recv,
           n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0,
           n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0,
           exec)
  }

  print("\n")

  # Start the next interval from empty arrays
  delete ifxmit
  delete ifrecv
  delete ifmerged
}

probe timer.ms(5000), end, error
{
  print_activity()
}
Note the following two expressions in print_activity():
n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0
n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0
Each one is an if/else conditional written in compact form. The second expression, for example, is equivalent to the following pseudocode:
if n_recv != 0 then
  @sum(ifrecv[pid, dev, exec, uid])/1024
else
  0
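To make the conditional concrete outside SystemTap, here is a minimal Python sketch of the same guard. The function name `kilobytes` and the sample byte totals are illustrative assumptions and do not come from nettop.stp.

```python
def kilobytes(packet_count: int, total_bytes: int) -> int:
    """Mirror `n ? @sum(...)/1024 : 0` from print_activity().

    packet_count stands in for n_xmit or n_recv, and total_bytes for
    @sum(...). Both names are illustrative only.
    """
    # Integer division: the fractional part of a kilobyte is discarded.
    return total_bytes // 1024 if packet_count != 0 else 0


if __name__ == "__main__":
    print(kilobytes(79, 5600))  # 79 packets totalling 5600 bytes (made-up total)
    print(kilobytes(0, 0))      # no packets seen in this interval
```

Because the division is integer division, a process that moved less than 1024 bytes in an interval is reported with a non-zero packet count but 0 KB.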
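nettop.stp tracks which processes generate network traffic and prints the following information for each process:
PID: the ID of the process
UID: the user ID of the process
DEV: the network device the process used to send or receive data (for example, eth0 or eth1)
XMIT_PK: the number of packets the process transmitted
RECV_PK: the number of packets the process received
XMIT_KB: the amount of data the process sent, in kilobytes
RECV_KB: the amount of data the process received, in kilobytes
COMMAND: the name of the executable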
nettop.stp prints a report every 5 seconds. To change the interval, edit probe timer.ms(5000). Here is an excerpt of the output of nettop.stp over 20 seconds:
[...]
  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          0       5       0       0 swapper
11178     0 eth0          2       0       0       0 synergyc

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
 2886     4 eth0         79       0       5       0 cups-polld
11362     0 eth0          0      61       0       5 firefox
    0     0 eth0          3      32       0       3 swapper
 2886     4 lo            4       4       0       0 cups-polld
11178     0 eth0          3       0       0       0 synergyc

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          0       6       0       0 swapper
 2886     4 lo            2       2       0       0 cups-polld
11178     0 eth0          3       0       0       0 synergyc
 3611     0 eth0          0       1       0       0 Xorg

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          3      42       0       2 swapper
11178     0 eth0         43       1       3       0 synergyc
11362     0 eth0          0       7       0       0 firefox
 3897     0 eth0          0       1       0       0 multiload-apple
[...]
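This section shows how to trace calls to the functions defined in the kernel's net/socket.c file. This helps you see in detail how each process interacts with the kernel's networking code.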
socket-trace.stp
#! /usr/bin/env stap

probe kernel.function("*@net/socket.c").call {
  printf ("%s -> %s\n", thread_indent(1), ppfunc())
}
probe kernel.function("*@net/socket.c").return {
  printf ("%s <- %s\n", thread_indent(-1), ppfunc())
}
socket-trace.stp is the same script that was used in Chapter 3 to introduce thread_indent(). Here is an excerpt of its output over 3 seconds:
[...]
0 Xorg(3611): -> sock_poll
3 Xorg(3611): <- sock_poll
0 Xorg(3611): -> sock_poll
3 Xorg(3611): <- sock_poll
0 gnome-terminal(11106): -> sock_poll
5 gnome-terminal(11106): <- sock_poll
0 scim-bridge(3883): -> sock_poll
3 scim-bridge(3883): <- sock_poll
0 scim-bridge(3883): -> sys_socketcall
4 scim-bridge(3883): -> sys_recv
8 scim-bridge(3883): -> sys_recvfrom
12 scim-bridge(3883):-> sock_from_file
16 scim-bridge(3883):<- sock_from_file
20 scim-bridge(3883):-> sock_recvmsg
24 scim-bridge(3883):<- sock_recvmsg
28 scim-bridge(3883): <- sys_recvfrom
31 scim-bridge(3883): <- sys_recv
35 scim-bridge(3883): <- sys_socketcall
[...]
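A trace like this can be post-processed to see how long each call took. The Python sketch below pairs every entry line with its matching return line and subtracts the leading numbers. It assumes that the leading number is the microsecond offset printed by thread_indent(), and the names `LINE_RE` and `call_durations` are invented for this example.

```python
import re
from collections import defaultdict

# One trace line looks like: "3 Xorg(3611): <- sock_poll"
LINE_RE = re.compile(r"^\s*(\d+)\s+(.+?)\((\d+)\):\s*(->|<-)\s*(\S+)\s*$")


def call_durations(lines):
    """Return (command, pid, function, elapsed) for each completed call."""
    stacks = defaultdict(list)  # (command, pid) -> stack of (function, entry stamp)
    results = []
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip "[...]" markers and unrelated text
        stamp, command, pid, arrow, func = match.groups()
        key = (command, int(pid))
        if arrow == "->":
            stacks[key].append((func, int(stamp)))
        elif stacks[key] and stacks[key][-1][0] == func:
            # A return that matches the innermost open call of this thread.
            _, entered = stacks[key].pop()
            results.append((command, int(pid), func, int(stamp) - entered))
    return results


if __name__ == "__main__":
    sample = [
        "0 scim-bridge(3883): -> sys_socketcall",
        "4 scim-bridge(3883): -> sys_recv",
        "31 scim-bridge(3883): <- sys_recv",
        "35 scim-bridge(3883): <- sys_socketcall",
    ]
    for record in call_durations(sample):
        print(record)
```

Returns without a recorded entry (for example, calls already in progress when tracing started) are ignored rather than guessed at.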
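This section shows how to monitor incoming TCP connections. This helps you spot unauthorized, suspicious, or otherwise unwanted connections as soon as they are made.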
tcp_connections.stp
#! /usr/bin/env stap

probe begin {
  printf("%6s %16s %6s %6s %16s\n",
         "UID", "CMD", "PID", "PORT", "IP_SOURCE")
}

# Only one of the two functions may exist in a given kernel,
# so both probe points are marked optional with "?"
probe kernel.function("tcp_accept").return?,
      kernel.function("inet_csk_accept").return? {
  sock = $return
  if (sock != 0)
    printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(),
           inet_get_local_port(sock), inet_get_ip_source(sock))
}
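While tcp_connections.stp is running, it prints a line for each newly accepted TCP connection in real time, showing the current UID, the command that accepted the connection (CMD), its PID, the local port used by the connection (PORT), and the IP address the connection came from (IP_SOURCE). Sample output: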
   UID              CMD    PID   PORT        IP_SOURCE
     0             sshd   3165     22      10.64.0.227
     0             sshd   3165     22      10.64.0.227
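Since the point of this script is to catch unexpected connections, its output lends itself to filtering. The Python sketch below flags rows whose source address falls outside a list of allowed networks. The function name `unexpected_sources`, the allowed network 10.64.0.0/16, and the second sample row are all invented for illustration.

```python
import ipaddress


def unexpected_sources(rows, allowed_networks):
    """Return (uid, cmd, pid, port, source) for rows outside allowed_networks."""
    networks = [ipaddress.ip_network(net) for net in allowed_networks]
    flagged = []
    for row in rows:
        fields = row.split()
        if len(fields) < 5 or fields[0] == "UID":
            continue  # skip the header line and anything too short
        uid, pid, port, source = fields[0], fields[-3], fields[-2], fields[-1]
        cmd = " ".join(fields[1:-3])
        address = ipaddress.ip_address(source)
        if not any(address in network for network in networks):
            flagged.append((int(uid), cmd, int(pid), int(port), source))
    return flagged


if __name__ == "__main__":
    # The first row comes from the sample output; the second row and the
    # allowed network are made up for this example.
    rows = [
        "0 sshd 3165 22 10.64.0.227",
        "0 sshd 3165 22 192.0.2.10",
    ]
    for entry in unexpected_sources(rows, ["10.64.0.0/16"]):
        print(entry)
```

Which networks count as expected is a site policy decision; the list passed in here is only a placeholder.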
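This section shows how to monitor the TCP packets that the system receives. This helps you analyze the network traffic that applications generate.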
tcpdumplike.stp
#! /usr/bin/env stap

// A TCP dump like example

probe begin, timer.s(1) {
  printf("-----------------------------------------------------------------\n")
  printf("       Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F \n")
  printf("-----------------------------------------------------------------\n")
}

probe udp.recvmsg /* ,udp.sendmsg */ {
  printf(" %15s %15s  %5d  %5d  UDP\n",
         saddr, daddr, sport, dport)
}

probe tcp.receive {
  printf(" %15s %15s  %5d  %5d  %d  %d  %d  %d  %d  %d\n",
         saddr, daddr, sport, dport, urg, ack, psh, rst, syn, fin)
}
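While tcpdumplike.stp is running, it prints the following information about each received TCP packet in real time:
Source and destination IP addresses (Source IP, Dest IP)
Source and destination ports (SPort, DPort)
Packet flags (U, A, P, R, S, F)
The script also prints a line marked UDP for each received UDP message.
To report a packet's flags, tcpdumplike.stp uses the following values, which the tcp.receive probe makes available:
urg: urgent
ack: acknowledgement
psh: push
rst: reset
syn: synchronize
fin: finished
Each value is 1 or 0, indicating whether the corresponding flag is set in the packet.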
-----------------------------------------------------------------
       Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F
-----------------------------------------------------------------
  209.85.229.147       10.0.2.15     80  20373  0  1  1  0  0  0
  92.122.126.240       10.0.2.15     80  53214  0  1  0  0  1  0
  92.122.126.240       10.0.2.15     80  53214  0  1  0  0  0  0
  209.85.229.118       10.0.2.15     80  63433  0  1  0  0  1  0
  209.85.229.118       10.0.2.15     80  63433  0  1  0  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.118       10.0.2.15     80  63433  0  1  1  0  0  0
[...]
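The six 0/1 columns are easier to read once they are turned back into flag names. The Python sketch below does that for one TCP row of the output. The names `FLAG_NAMES` and `parse_tcp_row` are invented for this example, and the column order is taken from the U A P R S F header.

```python
FLAG_NAMES = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")


def parse_tcp_row(row):
    """Split one TCP row into addresses, ports, and the names of the set flags."""
    fields = row.split()
    if len(fields) != 10:
        # Header lines, separators, and UDP rows do not have 10 fields.
        raise ValueError(f"expected 10 fields, got {len(fields)}: {row!r}")
    saddr, daddr = fields[0], fields[1]
    sport, dport = int(fields[2]), int(fields[3])
    flags = [name for name, bit in zip(FLAG_NAMES, fields[4:]) if bit == "1"]
    return saddr, daddr, sport, dport, flags


if __name__ == "__main__":
    # Two rows taken from the sample output.
    print(parse_tcp_row("92.122.126.240 10.0.2.15 80 53214 0 1 0 0 1 0"))
    print(parse_tcp_row("209.85.229.147 10.0.2.15 80 20373 0 1 1 0 0 0"))
```

The first row decodes to ACK plus SYN, which is the reply a server sends during the TCP handshake; the second decodes to ACK plus PSH, a segment carrying data.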
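In some situations the Linux network stack drops packets. Some versions of the Linux kernel include a static tracepoint, kernel.trace("kfree_skb"), that lets you trace where packets are discarded. dropwatch.stp uses this tracepoint to trace packet drops; every five seconds the script reports the locations at which packets were dropped.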
dropwatch.stp
#! /usr/bin/env stap

############################################################
# Dropwatch.stp
# Author: Neil Horman <nhorman@redhat.com>
# An example script to mimic the behavior of the dropwatch utility
# http://fedorahosted.org/dropwatch
############################################################

# Array to hold the list of drop points we find
global locations

# Note when we turn the monitor on and off
probe begin { printf("Monitoring for dropped packets\n") }
probe end { printf("Stopping dropped packet monitor\n") }

# increment a drop counter for every location we drop at
probe kernel.trace("kfree_skb") { locations[$location] <<< 1 }

# Every 5 seconds report our drop locations
probe timer.sec(5)
{
  printf("\n")
  foreach (l in locations-) {
    printf("%d packets dropped at %s\n",
           @count(locations[l]), symname(l))
  }
  delete locations
}
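kernel.trace("kfree_skb") traces the places in the kernel where network packets are dropped. It has two arguments: $skb, a pointer to the buffer being freed, and $location, the location in kernel code where the buffer is being freed. When the function name for the kernel address stored in $location is available, dropwatch.stp maps the address to that function name. This mapping is not enabled by default. With SystemTap 1.4 and later, you can enable it with the --all-modules option: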
stap --all-modules dropwatch.stp
On older versions of SystemTap, you can emulate the --all-modules option with the following command:
stap -dkernel \
`cat /proc/modules | awk 'BEGIN { ORS = " " } {print "-d"$1}'` \
dropwatch.stp
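To show what the backquoted pipeline expands to, here is a small Python sketch that builds the same argument list from the text of /proc/modules. The function name `module_flags` and the two sample module lines are invented for illustration; on a real system the text would be read from /proc/modules.

```python
def module_flags(proc_modules_text):
    """Build the stap arguments that the shell one-liner produces.

    proc_modules_text is the content of /proc/modules; the first
    whitespace-separated field of each line is the module name.
    """
    flags = ["-dkernel"]
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            flags.append("-d" + fields[0])
    return flags


if __name__ == "__main__":
    # Made-up sample in the /proc/modules layout, not taken from a real machine.
    sample = (
        "tun 28672 2 - Live 0xffffffffa0000000\n"
        "nf_conntrack 139264 1 - Live 0xffffffffa0010000\n"
    )
    command = ["stap", *module_flags(sample), "dropwatch.stp"]
    print(" ".join(command))
```

Each -d option asks stap to include symbol information for one module, which is what lets symname() resolve addresses that lie inside loaded modules.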
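Running dropwatch.stp for 15 seconds produces output similar to the following. Each line gives the number of packets dropped at one location, identified by function name or by address.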
Monitoring for dropped packets

1762 packets dropped at unix_stream_recvmsg
4 packets dropped at tun_do_read
2 packets dropped at nf_hook_slow

467 packets dropped at unix_stream_recvmsg
20 packets dropped at nf_hook_slow
6 packets dropped at tun_do_read

446 packets dropped at unix_stream_recvmsg
4 packets dropped at tun_do_read
4 packets dropped at nf_hook_slow
Stopping dropped packet monitor
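Because dropwatch.stp clears its array after every report, totals for a whole run have to be added up afterwards. The Python sketch below sums the per-interval counts from saved output. The names `DROP_RE` and `total_drops` are invented for this example.

```python
import re
from collections import Counter

# Matches lines such as "467 packets dropped at unix_stream_recvmsg"
DROP_RE = re.compile(r"^(\d+) packets dropped at (\S+)$")


def total_drops(lines):
    """Sum the drop counts per location across all reporting intervals."""
    totals = Counter()
    for line in lines:
        match = DROP_RE.match(line.strip())
        if match:
            totals[match.group(2)] += int(match.group(1))
    return totals


if __name__ == "__main__":
    # The nine report lines from the sample run.
    sample = """
    1762 packets dropped at unix_stream_recvmsg
    4 packets dropped at tun_do_read
    2 packets dropped at nf_hook_slow
    467 packets dropped at unix_stream_recvmsg
    20 packets dropped at nf_hook_slow
    6 packets dropped at tun_do_read
    446 packets dropped at unix_stream_recvmsg
    4 packets dropped at tun_do_read
    4 packets dropped at nf_hook_slow
    """.splitlines()
    for location, count in total_drops(sample).most_common():
        print(count, location)
```

For the sample run this gives 2675 drops at unix_stream_recvmsg, 26 at nf_hook_slow, and 14 at tun_do_read.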
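When neither --all-modules nor /proc/modules is available on the machine that runs the script, symname prints only the raw address. You can look up the function for an address in /boot/System.map-$(uname -r), which lists the starting address of each kernel function. In the following excerpt of that file, the address 0xffffffff8149a8ed maps to the function unix_stream_recvmsg, because it lies between the start of unix_stream_recvmsg (ffffffff8149a5e0) and the start of the next symbol, unix_find_other (ffffffff8149ad00):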
[...]
ffffffff8149a420 t unix_dgram_poll
ffffffff8149a5e0 t unix_stream_recvmsg
ffffffff8149ad00 t unix_find_other
[...]
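The lookup can be automated: find the last symbol whose starting address is at or below the raw address. The Python sketch below does this with a binary search over System.map text. The function names `load_symbols` and `lookup` are invented for this example, and the sketch assumes each line has the form "address type name".

```python
import bisect


def load_symbols(system_map_text):
    """Parse System.map lines of the form '<hex address> <type> <name>'."""
    symbols = []
    for line in system_map_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return the name of the last symbol that starts at or before address."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    return symbols[index][1] if index >= 0 else None


if __name__ == "__main__":
    # The three lines from the System.map excerpt.
    excerpt = """\
ffffffff8149a420 t unix_dgram_poll
ffffffff8149a5e0 t unix_stream_recvmsg
ffffffff8149ad00 t unix_find_other
"""
    print(lookup(load_symbols(excerpt), 0xffffffff8149a8ed))
```

This sketch does not know where a function ends, so an address beyond the last listed symbol, such as one inside a loadable module, would be attributed to that last symbol; treat such results with caution.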