Using R Social Network Analysis Packages to Visualize Infectious Disease Transmission Chains

Motivation. While handling the recent outbreak, I learned that transmission chains are still often drawn by hand in tools such as PowerPoint. The advantage is that the designer can add as much information as needed, for example the approximate location of each person or the route and intensity of contact. The drawback is that a chain is easy to lay out while it has few nodes (infected persons), but once the number of nodes reaches a certain size, the complexity of the relationships grows geometrically (one person infecting several others, one person having contact with several infected persons, and so on). At that point purely manual work takes a great deal of mental effort. The worst problem is revision: when field investigation findings change, altering a single node or link can cascade through the whole chain. The more nodes there are, the wider and more tangled the impact, much like untangling a ball of thread. Once the case count is large enough, manual layout becomes all but impossible.

As an R enthusiast (and, frankly, out of laziness), I explored whether software could draw the transmission chain automatically, using igraph, ggraph and networkD3. The results are shown in the figures below; personally I think networkD3, with its flashy interactivity, works best.

How it was made. See the documentation of the three packages I used. Detailed steps are still to be added.

Data.

Node data. The node data only needs to contain basic information on every infected person, such as ID, name and category.

Edge data. At a minimum, the edge data must give the pairwise relationships among all infected persons in the node data. Put simply, it is like two Excel columns, from and to, where each row records who transmitted to whom. This information, of course, has to come from the hard work of the field epidemiologists.

Visualization.

igraph. First, graph_from_data_frame(d = line, vertices = node, directed = T) converts the nodes and edges into an igraph object, which can be passed straight to plot() to produce the first figure. The plot parameters can be tuned as needed.

networkD3. This gives my favourite result. igraph_to_networkD3 converts the igraph data, after which simpleNetwork, forceNetwork or sankeyNetwork can draw an interactive network. I tried it in a mobile browser and the interaction works there as well, including dragging nodes, zooming and panning. A great experience.

[Interactive networkD3 widgets: a forceNetwork and a sankeyNetwork of the 57 cases, labelled 0号 to 56号]

ggraph. ggraph largely follows the ggplot2 approach to plotting, and its output tends to look nicer. First the tidygraph package converts the igraph data into a form better suited to ggraph; after that you can plot happily in the ggplot2 style.
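The igraph and networkD3 steps described in this post can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the toy `node` and `line` data frames, the case labels and the `group` column are all made up here, and only the `from`/`to` column layout comes from the post.

```r
library(igraph)
library(networkD3)

# Hypothetical node table: one row per infected person (values are invented).
node <- data.frame(
  name  = paste0("case", 0:5),
  group = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical edge table: each row means "from" transmitted to "to".
line <- data.frame(
  from = c("case0", "case1", "case1", "case2", "case2"),
  to   = c("case1", "case2", "case3", "case4", "case5"),
  stringsAsFactors = FALSE
)

# Build a directed igraph object; the first column of `vertices` is used as the vertex name.
g <- graph_from_data_frame(d = line, vertices = node, directed = TRUE)

# Static plot; the sizes are arbitrary and can be tuned.
plot(g, vertex.size = 20, edge.arrow.size = 0.4)

# Convert to the zero-indexed links/nodes lists that networkD3 expects.
d3 <- igraph_to_networkD3(g, group = V(g)$group)

# Interactive force-directed network with zooming and arrows enabled.
widget <- forceNetwork(
  Links = d3$links, Nodes = d3$nodes,
  Source = "source", Target = "target",
  NodeID = "name", Group = "group",
  zoom = TRUE, arrows = TRUE, opacity = 0.9
)
widget
```

Because the chain is rebuilt from the two tables every time, a change in the field investigation only requires editing the affected rows of `line` and re-running the script, which is the main benefit over redrawing by hand.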

March 25, 2022 · Luo Fei

A Simple Visual Analysis of Shanghai Outbreak Data (updated April 20)

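Today the local case count finally stopped rising, so I took some time to look at the situation elsewhere. Reading the officially reported figures, I became curious about the outbreak in what used to be the model city, and decided to take a quick look.

1 Data acquisition

1.1 Data source

For accurate data, the official website is the obvious place to go. On the Shanghai Municipal Health Commission site (https://wsjkw.sh.gov.cn/xwfb/index.html), the outbreak announcements all sit under the "新闻发布" (news releases) section, and the title of each announcement already contains the new and confirmed case counts. Very convenient.

    library(rvest)
    library(tidyverse)
    library(lubridate)
    library(readxl)
    library(openxlsx)
    library(ggforce)
    library(mgcv)
    library(deSolve)
    library(FME)

1.2 Data

The changes in Shanghai's outbreak mainly begin in March.

1.3 Data cleaning

The tedious part of this step is tidying the date strings in the titles. The dates extracted with str_extract_all come back as a list, and merging them into a date vector caused some trouble. In the end I used a clumsy approach: a for loop that calls unlist and then joins the pieces with paste0. In fact, simply taking the publication date shown before each title minus one day would introduce no meaningful error; I was mostly being stubborn in making it this complicated.

2 Now, on to the analysis

2.1 A simple plot of the overall trend

Colouring by reported case type, the number of infections in Shanghai stayed at a roughly stable level from January 2020 to December 2021, and then shot up almost vertically from March 2022. Oddly, there is a blank stretch in the middle with no data. Checking the original pages shows that the site published no updates from November 6, 2021 to January 1, 2022. That is strange. It does not affect the analysis below, although it could matter a great deal for understanding the causes.

2.2 Trend in imported cases

One plausible conjecture about this wave is that in early January Shanghai took in a large number of flights from a certain place, which sharply increased the pressure from imported cases; combined with the very high transmissibility of the Omicron variant, this double pressure caused the model city to fall. So let us see whether the surge in import pressure can be confirmed. The plot shows little change in the number of new imported cases. Since confirmed and asymptomatic cases are both infections, the next step is to visualize the combined totals of local and imported infections.

2.3 Total infections

A scatter plot of the two shows no apparent relationship at all. When imported infections were high, local infections were actually low. With a picture like this, linear regression is off the table for now. Let us look instead at the number of infections over time.

The resulting plot shows that before March 14 Shanghai had almost no locally acquired cases.

2.4 March to April 2022

Now zoom the time axis in to mid-February through March. In the enlarged view (first figure below), imported infections start to rise by March 14, while local infections still show no marked change and remain flat. To see the changes more clearly, we restrict the y-axis to 0-1000 cases. The daily changes are then clearly visible: local infections (blue) rise rapidly from March 16.

2.5 A closer look at the March data

2.6 Distribution by district

I obtained the outbreak data by district. The result is clear: Pudong New Area (浦东新区) has the most infections, as the table below shows. All dates are in 2022, given as month-day, in the column order of the original table.

District         04-07 04-06 04-05 04-04 03-28 03-31 03-30 03-29 03-27 04-01 03-24 03-26 04-03 03-23 03-20 03-21 03-19 03-22 03-25
Pudong New Area   9050  8457  8145  7071  2506  2407  2207  2183  1429    NA   193   323    NA   436   220   169   135    NA    NA
Minhang           2257  2409  2937  1381    NA   392   780   987   619  1043   980   972    NA   256    NA   122    NA    NA    NA
Xuhui             2076  1107   920  1229    NA   226   404  1100   277   639   167   331    NA   212    46   130   122    NA    NA
Huangpu           1380  1044   658   970    NA   121   361   110    56   260   204   320   824    40    42    98    72   118    NA
Songjiang         1288   781   796  1106    NA    NA   476   426   190   948   184   158    NA    68    18    76    32    NA    NA
Putuo              957  1033   483   254    NA    15   146   113    70   245    38    69    NA    68    36    31    50    NA    NA
Jiading            933    NA   481   237    NA    54   158   255     0     0   130     0    NA    69     0    NA    81    NA    NA
Changning          852   350    84    33   115   256   128    30    90    74   118    29    NA    22    13    52    24    38     8
Hongkou            594   668   410   608    NA   128    84    73    27    61    58    50    NA    52    24    46    NA    NA    NA
Baoshan            414   660   554   265   311    17   504   363    94    45    87   153    NA   138    26   134    NA    32     8
Yangpu             601   630   623   220    NA   174   100    99    70   134    21    23    NA    22    16    20    28    NA    NA
Fengxian            92   556   144    64   230   183   130    96    90   161    48   168   119    12    26    22    12    NA    NA
Jing'an            381   545   302    50   104   164   102   175    70   189    70   108   338    96    64    88    30    NA     4
Chongming          520   126    79    94    NA    44   386    55   472   166   164    54    NA   108    NA    NA    NA    38    82
Qingpu             493   470   384   314    NA    95    77   174    86   174    26    44    NA    20     8    10    10    40    26
Jinshan            129    79    77    52    NA    42    66    26     0    49     0     0    NA     0    22    NA    NA    NA    NA

Let us draw a rose chart as well, although this kind of chart does not really reflect the features of the data very well. ...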
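The title-cleaning step in section 1.3 might be done without the for loop, along the lines below. This is only a sketch under stated assumptions: the two titles are hypothetical, their wording is my guess at the announcement format, and the counts are invented for illustration. It also swaps in `str_match()` for the `str_extract_all()` call the post mentions, because `str_match()` returns a matrix rather than a list.

```r
library(stringr)

# Hypothetical announcement titles; both the format and the numbers are assumptions.
titles <- c(
  "上海2022年4月6日，新增本土新冠肺炎确诊病例10例 新增本土无症状感染者200例",
  "上海2022年4月7日，新增本土新冠肺炎确诊病例12例 新增本土无症状感染者250例"
)

# str_match() returns a character matrix: column 1 is the full match,
# columns 2 to 4 are the year, month and day capture groups.
ymd_parts <- str_match(titles, "(\\d{4})年(\\d{1,2})月(\\d{1,2})日")

# Assemble ISO-style strings and parse them; this is vectorised, so no loop is needed.
report_date <- as.Date(
  sprintf("%s-%s-%s", ymd_parts[, 2], ymd_parts[, 3], ymd_parts[, 4])
)

# Pull the counts out of the same titles with one capture group each.
confirmed    <- as.integer(str_match(titles, "确诊病例(\\d+)例")[, 2])
asymptomatic <- as.integer(str_match(titles, "无症状感染者(\\d+)例")[, 2])

daily <- data.frame(report_date, confirmed, asymptomatic)
print(daily)
```

Titles that do not match a pattern produce NA in the corresponding column instead of an error, so malformed announcements are easy to spot afterwards with `is.na()`.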

March 24, 2022 · Luo Fei

Notes on Building This Site (Pitfalls and Lessons)  [draft]

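In the middle of a busy job I have been teaching myself R in fits and starts, digging through materials and looking things up in English. For a middle-aged guy this has honestly been hard work, and it deserves a record. My first real contact with R was in 2015, while I was working in Shanghai. I did not study it in depth then, only the basic features I might need in day-to-day work. I started learning seriously at the beginning of the COVID-19 outbreak in 2019, when I had to process large amounts of data, analyse, plot and build models, and so spent a solid stretch of time on it. Along the way I drew on textbooks, books and reference material from many experts, and I owe particular thanks to Yihui (一辉). From rmarkdown, bookdown and blogdown to knitr, all for the sake of literate programming, I ended up also learning LaTeX, HTML, CSS, pandoc and more. Blood, sweat and tears 😭. I still plan to write that story up properly in a separate post. This post mainly records my experience building the site with blogdown + Hugo + Netlify, including the pitfalls.

About blogdown

I will not go over what blogdown does; if you want to know, please see Yihui's blogdown documentation. Here I only record the pitfalls I ran into.

Pitfalls

Custom code highlighting has no effect (unresolved)

Following the hugo-paper tutorial, I found that the setting below does take effect: once it is set, code blocks are no longer highlighted.

    params:
      assets:
        disableHLJS: true

The following configuration has no effect. With it set, nothing is displayed, and uploading to Netlify makes no difference.

    markup:
      highlight:
        # anchorLineNos: true
        codeFences: true
        guessSyntax: true
        lineNos: true
        # noClasses: false
        style: monokai

After downloading the CSS styles from the highlight.js website, renaming the style you like to an-old-hope.min and placing it in /assets/css under the site root does change the highlighting style, but I recommend choosing a dark-mode style. I have not yet found a way to handle the site's light/dark mode switch.

After changing the style, the white text appears to be redefined by the hugo-paper theme and does not follow the new style. Unresolved.

baseURL

If you follow the tutorial and set baseURL to the domain provided by Netlify, then after pointing your own domain at the site, second-level links still go to the original Netlify domain.

Solution: set the following in config.yaml

    baseURL: /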

January 12, 2022 · Luo Fei