calligraphy/笔名汉字频率分析.md
2022-06-23 13:52:13 +08:00

# Pen-Name Chinese Character Frequency Analysis
<br />
> The data comes from the records of my own pen names in [书法练习轨迹--明月几时有]( https://xushufa.cn ).
<br />
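## 1、Pen-name analysis
- Current pen name: 无为徐生
- Every app you register with adds another account, and every account has a nickname or username. Because nicknames can be changed, the number of nicknames far exceeds the number of accounts: after several years, or even more than a decade, of using QQ or WeChat, a nickname may have changed countless times. Put simply, however many apps are installed on a phone, its owner has used more nicknames than that. This holds for everyone, so people today have a great many nicknames.
- Pen names, like personal names, are very often shared by different people, so a pen name alone rarely identifies the intended person. By contrast, the email address 1021151991@qq.com is unique, valid, and the one I use regularly.
- This document takes the hundred-plus pen names, nicknames, and account names recorded over the years in 书法练习轨迹--明月几时有 and uses Java to count them by category and by character frequency, giving some simple statistics and analysis.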
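
The core counting idea can be sketched in a few lines; the author's full program appears in section 3. The following is a simplified illustration under stated assumptions, not the project's code: the class name `NicknameCharCountSketch` and the method `countChars` are hypothetical, and the three sample names are taken from the lists in this document.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only
class NicknameCharCountSketch {

    /** Counts how often each character occurs across the given names. */
    static Map<String, Integer> countChars(List<String> names) {
        // LinkedHashMap keeps characters in order of first appearance
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : names) {
            // Iterate by code point so that each character is counted once
            name.codePoints()
                    .mapToObj(cp -> new String(Character.toChars(cp)))
                    .forEach(ch -> counts.merge(ch, 1, Integer::sum));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Three pen names taken from the lists in this document
        List<String> sample = Arrays.asList("无为徐生", "徐先生", "明明如夜");
        System.out.println(countChars(sample));
    }
}
```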
---
### 1.1、20210605
> Pen names
```
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、无为徐生
三字:徐先生、明来其、名字清
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xy180、xuyq123、sc1、jack、Jack2Java
```
> Analysis
```bash
笔名总数43个。其中五字4个,四字10个,三字3个,二字13个,一字4个,字母9个。二字13个最多,三字3个最少。汉字‘明’出现次数最多,频率最高,共17次。
汉字频率:{"明":17,"夜":9,"天":6,"如":3,"徐":3,"星":3,"之":3,"恒":3,"无":3,"名":2,"生":2,"风":2,"青":2,"来":2,"长":2,"耀":1,"一":1,"清":1,"有":1,"下":1,"小":1,"源":1,"云":1,"尔":1,"五":1,"林":1,"龙":1,"甲":1,"水":1,"岸":1,"方":1,"为":1,"去":1,"吾":1,"线":1,"先":1,"光":1,"秋":1,"子":1,"道":1,"湖":1,"字":1,"博":1,"盛":1,"襟":1,"古":1,"知":1,"共":1,"深":1,"右":1,"其":1,"松":1}
```
---
### 1.2、20210708
> Pen names
```
六字:烟火尘埃落定
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰
三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry
```
> Analysis
```bash
笔名总数81个。其中六字1个,五字6个,四字13个,三字8个,二字32个,一字4个,字母17个。二字32个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共33次。
汉字频率:{"明":33,"夜":15,"天":9,"之":6,"徐":5,"星":5,"一":4,"如":4,"风":4,"青":4,"恒":3,"无":3,"长":3,"清":2,"名":2,"小":2,"云":2,"生":2,"水":2,"秋":2,"子":2,"来":2,"米":2,"耀":1,"老":1,"玄":1,"了":1,"有":1,"下":1,"源":1,"尔":1,"五":1,"世":1,"林":1,"斗":1,"尘":1,"龙":1,"定":1,"业":1,"昨":1,"中":1,"辰":1,"甲":1,"岸":1,"方":1,"改":1,"为":1,"去":1,"格":1,"落":1,"达":1,"吾":1,"线":1,"埃":1,"先":1,"迈":1,"光":1,"克":1,"界":1,"行":1,"柏":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"盛":1,"飞":1,"烟":1,"襟":1,"也":1,"古":1,"健":1,"知":1,"雨":1,"择":1,"火":1,"问":1,"共":1,"深":1,"曲":1,"右":1,"其":1,"松":1,"衿":1}
```
---
### 1.3、20210712
> Pen names
```
六字:烟火尘埃落定
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰、东门之旭、长风雨下、清风徐来、原味吐司、星星水源、和山以水
三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜、西之白、东门之、东有雨、徐庶一、飞一飞、斜风细、明之夜
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达、徐生、四海、广毅、麦克、和时、尚書
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry
```
> Analysis
```bash
笔名总数100个。其中六字1个,五字6个,四字19个,三字15个,二字38个,一字4个,字母17个。二字38个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共34次。
汉字频率:{"明":34,"夜":16,"之":10,"天":9,"徐":8,"星":7,"风":7,"一":6,"水":4,"青":4,"长":4,"如":4,"清":3,"东":3,"生":3,"恒":3,"来":3,"飞":3,"无":3,"雨":3,"有":2,"下":2,"名":2,"小":2,"克":2,"子":2,"米":2,"和":2,"源":2,"云":2,"秋":2,"门":2,"耀":1,"老":1,"吐":1,"尔":1,"世":1,"尘":1,"业":1,"尚":1,"昨":1,"中":1,"甲":1,"改":1,"为":1,"格":1,"落":1,"吾":1,"先":1,"光":1,"界":1,"行":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"襟":1,"也":1,"健":1,"火":1,"共":1,"山":1,"味":1,"其":1,"海":1,"白":1,"松":1,"衿":1,"西":1,"广":1,"玄":1,"了":1,"五":1,"林":1,"斗":1,"龙":1,"定":1,"斜":1,"原":1,"麦":1,"辰":1,"庶":1,"岸":1,"方":1,"去":1,"达":1,"线":1,"埃":1,"毅":1,"细":1,"迈":1,"柏":1,"盛":1,"四":1,"烟":1,"古":1,"以":1,"知":1,"择":1,"旭":1,"问":1,"深":1,"曲":1,"右":1,"时":1,"司":1,"書":1}
```
---
### 1.4、20210802
> Pen names
```
六字:烟火尘埃落定
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰、东门之旭、长风雨下、清风徐来、原味吐司、星星水源、和山以水
三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜、西之白、东门之、东有雨、徐庶一、飞一飞、斜风细、明之夜
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达、徐生、四海、广毅、麦克、和时、尚書、张及、李星、赵星、张三、李四、王五、赵六、测试、木叶
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry、xyq、xu、gaizhi180、xy18484、test、xu123
```
> Analysis
```bash
笔名总数115个。其中六字1个,五字6个,四字19个,三字15个,二字47个,一字4个,字母23个。二字47个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共34次。
汉字频率:{"明":34,"夜":16,"之":10,"星":9,"天":9,"徐":8,"风":7,"一":6,"水":4,"青":4,"长":4,"如":4,"清":3,"东":3,"生":3,"恒":3,"来":3,"飞":3,"无":3,"雨":3,"有":2,"下":2,"名":2,"小":2,"张":2,"克":2,"李":2,"子":2,"米":2,"赵":2,"和":2,"源":2,"云":2,"五":2,"秋":2,"四":2,"门":2,"耀":1,"老":1,"三":1,"吐":1,"尔":1,"世":1,"尘":1,"业":1,"尚":1,"昨":1,"木":1,"中":1,"甲":1,"改":1,"为":1,"格":1,"落":1,"吾":1,"先":1,"光":1,"测":1,"界":1,"行":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"襟":1,"也":1,"健":1,"火":1,"六":1,"共":1,"山":1,"味":1,"其":1,"海":1,"白":1,"松":1,"衿":1,"西":1,"广":1,"玄":1,"了":1,"王":1,"林":1,"斗":1,"龙":1,"定":1,"斜":1,"原":1,"麦":1,"辰":1,"庶":1,"岸":1,"方":1,"去":1,"达":1,"线":1,"埃":1,"毅":1,"细":1,"迈":1,"及":1,"柏":1,"试":1,"盛":1,"烟":1,"古":1,"以":1,"知":1,"择":1,"旭":1,"问":1,"深":1,"曲":1,"右":1,"时":1,"叶":1,"司":1,"書":1}
```
---
### 1.5、20211213
> Pen names
```
六字:烟火尘埃落定
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰、东门之旭、长风雨下、清风徐来、原味吐司、星星水源、和山以水、安之若素、顿覆三千、呜呼呀嘿
三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜、西之白、东门之、东有雨、徐庶一、飞一飞、斜风细、明之夜
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达、徐生、四海、广毅、麦克、和时、尚書、张及、李星、赵星、张三、李四、王五、赵六、测试、木叶、少焉、雾州、行之
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry、xyq、xu、gaizhi180、xy18484、test、xu123
```
> Analysis
```bash
笔名总数121个。其中六字1个,五字6个,四字22个,三字15个,二字50个,一字4个,字母23个。二字50个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共34次。
汉字频率:{"明":34,"夜":16,"之":12,"星":9,"天":9,"徐":8,"风":7,"一":6,"水":4,"青":4,"长":4,"如":4,"清":3,"东":3,"生":3,"恒":3,"来":3,"飞":3,"无":3,"雨":3,"有":2,"三":2,"下":2,"名":2,"小":2,"张":2,"克":2,"行":2,"李":2,"子":2,"米":2,"赵":2,"和":2,"源":2,"云":2,"五":2,"秋":2,"四":2,"门":2,"耀":1,"老":1,"焉":1,"吐":1,"少":1,"尔":1,"世":1,"尘":1,"业":1,"尚":1,"素":1,"昨":1,"木":1,"中":1,"甲":1,"改":1,"为":1,"格":1,"落":1,"吾":1,"嘿":1,"呀":1,"千":1,"先":1,"光":1,"测":1,"界":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"呜":1,"襟":1,"也":1,"健":1,"火":1,"六":1,"共":1,"山":1,"味":1,"其":1,"海":1,"呼":1,"白":1,"松":1,"衿":1,"顿":1,"西":1,"广":1,"玄":1,"覆":1,"了":1,"安":1,"王":1,"林":1,"斗":1,"龙":1,"定":1,"斜":1,"原":1,"麦":1,"辰":1,"庶":1,"岸":1,"方":1,"去":1,"达":1,"线":1,"埃":1,"毅":1,"细":1,"迈":1,"及":1,"柏":1,"试":1,"盛":1,"州":1,"烟":1,"古":1,"以":1,"若":1,"知":1,"择":1,"旭":1,"问":1,"深":1,"曲":1,"右":1,"时":1,"叶":1,"司":1,"書":1,"雾":1}
```
---
### 1.6、20220226
> Pen names
```
六字:烟火尘埃落定
五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜
四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰、东门之旭、长风雨下、清风徐来、原味吐司、星星水源、和山以水、安之若素、顿覆三千、呜呼呀嘿
三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜、西之白、东门之、东有雨、徐庶一、飞一飞、斜风细、明之夜、明一天
二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达、徐生、四海、广毅、麦克、和时、尚書、张及、李星、赵星、张三、李四、王五、赵六、测试、木叶、少焉、雾州、行之、东门
一字:徐、吾、风、明
字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry、xyq、xu、gaizhi180、xy18484、test、xu123、whatyn、wuhuFly、xuyq、wuhuxy、xyqcalligraphy
```
> Analysis
```bash
笔名总数128个。其中六字1个,五字6个,四字22个,三字16个,二字51个,一字4个,字母28个。二字51个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共35次。
汉字频率:{"明":35,"夜":16,"之":12,"天":10,"星":9,"徐":8,"一":7,"风":7,"东":4,"水":4,"青":4,"长":4,"如":4,"清":3,"生":3,"恒":3,"来":3,"飞":3,"无":3,"门":3,"雨":3,"有":2,"三":2,"下":2,"名":2,"小":2,"张":2,"克":2,"行":2,"李":2,"子":2,"米":2,"赵":2,"和":2,"源":2,"云":2,"五":2,"秋":2,"四":2,"耀":1,"老":1,"焉":1,"吐":1,"少":1,"尔":1,"世":1,"尘":1,"业":1,"尚":1,"素":1,"昨":1,"木":1,"中":1,"甲":1,"改":1,"为":1,"格":1,"落":1,"吾":1,"嘿":1,"呀":1,"千":1,"先":1,"光":1,"测":1,"界":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"呜":1,"襟":1,"也":1,"健":1,"火":1,"六":1,"共":1,"山":1,"味":1,"其":1,"海":1,"呼":1,"白":1,"松":1,"衿":1,"顿":1,"西":1,"广":1,"玄":1,"覆":1,"了":1,"安":1,"王":1,"林":1,"斗":1,"龙":1,"定":1,"斜":1,"原":1,"麦":1,"辰":1,"庶":1,"岸":1,"方":1,"去":1,"达":1,"线":1,"埃":1,"毅":1,"细":1,"迈":1,"及":1,"柏":1,"试":1,"盛":1,"州":1,"烟":1,"古":1,"以":1,"若":1,"知":1,"择":1,"旭":1,"问":1,"深":1,"曲":1,"右":1,"时":1,"叶":1,"司":1,"書":1,"雾":1}
```
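The summary lines of sections 1.1 to 1.6 can be compared across snapshots. The following is a minimal Java sketch, not part of the original project, that computes how many pen names were added between consecutive snapshots and how the count of 明 changed. The class name `SnapshotGrowthSketch` and the method `deltas` are hypothetical; the numbers are copied from the summaries above.

```java
// Hypothetical helper, for illustration only
class SnapshotGrowthSketch {

    /** Returns the difference between each value and the one before it. */
    static int[] deltas(int[] values) {
        int[] out = new int[Math.max(0, values.length - 1)];
        for (int i = 1; i < values.length; i++) {
            out[i - 1] = values[i] - values[i - 1];
        }
        return out;
    }

    public static void main(String[] args) {
        // Snapshot dates and figures taken from sections 1.1 to 1.6
        String[] dates = {"20210605", "20210708", "20210712", "20210802", "20211213", "20220226"};
        int[] totals = {43, 81, 100, 115, 121, 128};
        int[] mingCounts = {17, 33, 34, 34, 34, 35};

        int[] totalDeltas = deltas(totals);
        int[] mingDeltas = deltas(mingCounts);
        for (int i = 0; i < totalDeltas.length; i++) {
            System.out.println(dates[i + 1] + ": +" + totalDeltas[i]
                    + " pen names, 明 +" + mingDeltas[i]);
        }
    }
}
```

Under these figures, the largest jump is between the first two snapshots (38 new pen names, 16 more occurrences of 明); after 20210708 the count of 明 changes by at most one per snapshot.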
## 2、Meaning of 明
```
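Pinyin: míng
Radical: 日
Strokes: 8
Five Elements: water (水)
Traditional form: 明
Wubi code: JEG
Stroke order: vertical, horizontal-turn, horizontal, horizontal, left-falling, horizontal-turn-hook, horizontal, horizontal
Meanings:
1. Bright (opposite of 暗, "dark"): ~月, 天~, 灯火通~.
2. Clear; understood: 问~, 讲~, 分~, 去向不~.
3. Open; outwardly visible; not hidden (opposite of 暗): ~说, ~令, ~沟, ~枪易躲,暗箭难防.
4. Sharp-eyed; of sound judgment; seeing things clearly: 聪~, 英~, 精~强干, 耳聪目~, 眼~手快.
5. Light, brightness: 弃暗投~, ~人不做暗事.
6. Eyesight: 双目失~.
7. To understand; to know: 深~大义, 不~利害.
8. To show; to make clear: 开宗~义, 赋诗~志.
9. 明明, meaning "obviously so": 你~知道他不会,干吗还要为难他呀?
10. Next after the current year or day: ~天, ~晨, ~年, ~春.
11. A dynasty (1368-1644 CE) founded by Zhu Yuanzhang. Its capital was first Nanjing and was moved to Beijing during the Yongle reign.
12. A surname.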
```
---
## 3、Java code
```java
package com.xu.calligraphy.boot.common.util;

import com.alibaba.fastjson.JSON;

import java.util.*;

/**
 * Pen-name Chinese character frequency analysis.
 * File location: https://gitlab.com/xuyq123/calligraphy-boot
 * Data source: https://gitlab.com/xuyq123/calligraphy/-/tree/master (书法练习轨迹--明月几时有)
 *
 * @author xyq
 * @date 2021-07-08 14:32
 */
public class CalligraphyNicknameAnalysisUtil {

    public static void main(String[] args) {
        // One category per line: a label, a colon, then the names separated by "、"
        String text = "六字:烟火尘埃落定\n" +
                "五字:明明之明夜、明明之夜夜、星夜无恒恒、秋水共长天、青青子衿天、明明如夜夜\n" +
                "四字:明明如夜、明如夜天、明来天明、明明夜夜、星夜无恒、如松之盛、天下有道、青青子襟、右耀去明、秋水长天、无为徐生、清风之明、昨夜星辰、东门之旭、长风雨下、清风徐来、原味吐司、星星水源、和山以水\n" +
                "三字:徐先生、明来其、名字清、一世界、雨中曲、天行健、明明了、明明夜、西之白、东门之、东有雨、徐庶一、飞一飞、斜风细、明之夜\n" +
                "二字:博尔、古林、明天、深岸、甲方、龙光、知一、五湖、星名、源线、小明、长天、风云、老徐、迈克、改之、择之、星夜、一柏、小徐、风格、明业、明飞、问橐、明夜、明一、米明、明云、明也、玄明、斗米、明达、徐生、四海、广毅、麦克、和时、尚書、张及、李星、赵星、张三、李四、王五、赵六、测试、木叶\n" +
                "一字:徐、吾、风、明\n" +
                "字母:xyqmmry、txyda、scott、scott180、xyqin、xy180、xu180、xuyq123、sc1、jack、Jack2Java、wuhu、longguang123、scott123、yxm、txyd、mmry、xyq、xu、gaizhi180、xy18484、test、xu123";
        System.out.println(statisticsNum(text));
    }

    /*
    Output for the input above:
    笔名总数115个。其中六字1个,五字6个,四字19个,三字15个,二字47个,一字4个,字母23个。二字47个最多,六字1个最少。汉字‘明’出现次数最多,频率最高,共34次。
汉字频率:{"明":34,"夜":16,"之":10,"星":9,"天":9,"徐":8,"风":7,"一":6,"水":4,"青":4,"长":4,"如":4,"清":3,"东":3,"生":3,"恒":3,"来":3,"飞":3,"无":3,"雨":3,"有":2,"下":2,"名":2,"小":2,"张":2,"克":2,"李":2,"子":2,"米":2,"赵":2,"和":2,"源":2,"云":2,"五":2,"秋":2,"四":2,"门":2,"耀":1,"老":1,"三":1,"吐":1,"尔":1,"世":1,"尘":1,"业":1,"尚":1,"昨":1,"木":1,"中":1,"甲":1,"改":1,"为":1,"格":1,"落":1,"吾":1,"先":1,"光":1,"测":1,"界":1,"行":1,"橐":1,"道":1,"湖":1,"字":1,"博":1,"襟":1,"也":1,"健":1,"火":1,"六":1,"共":1,"山":1,"味":1,"其":1,"海":1,"白":1,"松":1,"衿":1,"西":1,"广":1,"玄":1,"了":1,"王":1,"林":1,"斗":1,"龙":1,"定":1,"斜":1,"原":1,"麦":1,"辰":1,"庶":1,"岸":1,"方":1,"去":1,"达":1,"线":1,"埃":1,"毅":1,"细":1,"迈":1,"及":1,"柏":1,"试":1,"盛":1,"烟":1,"古":1,"以":1,"知":1,"择":1,"旭":1,"问":1,"深":1,"曲":1,"右":1,"时":1,"叶":1,"司":1,"書":1}
*/
    /**
     * Counts the pen names in each category and builds the summary text.
     *
     * @param text one category per line: a label, a colon, then names separated by "、"
     * @return the summary sentence followed by the character-frequency map as JSON
     */
    private static String statisticsNum(String text) {
        String[] lineArr = text.split("\n");
        StringBuilder buffer = new StringBuilder();
        // "x" is a placeholder for the total, filled in once every line has been counted
        buffer.append("笔名总数x个。其中");
        StringBuilder nicknameText = new StringBuilder();
        int sum = 0;
        int max = 0;
        int min = Integer.MAX_VALUE;
        String maxText = "";
        String minText = "";
        // Pen name -> number of occurrences, used to detect duplicates
        Map<String, Integer> nicknameMap = new LinkedHashMap<>();
        for (String val : lineArr) {
            // Split the category label from its names; accept the full-width or the ASCII colon
            String[] nicknameArr = val.split("[::]", 2);
            String[] nicknameArray = nicknameArr[1].split("、");
            int length = nicknameArray.length;
            sum += length;
            String num = nicknameArr[0] + length + "个";
            buffer.append(num).append(",");
            if (length > max) {
                maxText = num + "最多";
                max = length;
            }
            if (length < min) {
                minText = num + "最少";
                min = length;
            }
            // Alphabetic names are counted above but left out of the character-frequency text
            if (!"字母".equals(nicknameArr[0])) {
                nicknameText.append(nicknameArr[1]).append("、");
            }
            for (String nickname : nicknameArray) {
                nicknameMap.merge(nickname, 1, Integer::sum);
            }
        }
        // Category summary: replace the trailing comma, then add the largest and smallest categories
        buffer.deleteCharAt(buffer.length() - 1).append("。").append(maxText).append(",").append(minText).append("。");
        String statisticsNum = buffer.toString().replace("x", String.valueOf(sum));
        // Single-character frequency; the map is sorted, so its first key is the most frequent character
        Map<String, Integer> statisticsFrequencyMap = statisticsFrequency(nicknameText.toString());
        StringBuilder mostChineseChar = new StringBuilder();
        String maxNumChar = statisticsFrequencyMap.keySet().stream().findFirst().get();
        mostChineseChar.append("汉字‘").append(maxNumChar).append("’出现次数最多,频率最高,共").append(statisticsFrequencyMap.get(maxNumChar)).append("次。");
        // Report duplicate pen names, if any
        if (nicknameMap.size() != sum) {
            System.out.print("提示:笔名有重复,");
            nicknameMap.entrySet().stream().filter(nick -> nick.getValue() > 1).forEach(nick -> System.out.print(nick.getKey() + " "));
            System.out.println();
            System.out.println(nicknameText);
            System.out.println(JSON.toJSONString(nicknameMap));
            System.out.println();
        }
        return statisticsNum + mostChineseChar + "\n\n汉字频率:" + JSON.toJSONString(statisticsFrequencyMap);
    }
    /**
     * Counts how often each single character occurs.
     *
     * @param text pen names joined by "、"
     * @return character -> count, sorted by count in descending order
     */
    private static Map<String, Integer> statisticsFrequency(String text) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < text.length(); i++) {
            String val = text.substring(i, i + 1);
            // Skip separators
            if ("、".equals(val) || "\n".equals(val)) {
                continue;
            }
            map.merge(val, 1, Integer::sum);
        }
        return sortMap(map);
    }
    /**
     * Sorts a map by count in descending order.
     *
     * @param map character -> count
     * @return a LinkedHashMap that iterates from the highest count to the lowest
     */
    public static Map<String, Integer> sortMap(Map<String, Integer> map) {
        // Copy the entries into a list so they can be sorted
        List<Map.Entry<String, Integer>> entryList = new ArrayList<>(map.entrySet());
        // Descending by count; Integer.compare avoids the overflow risk of subtraction
        entryList.sort((o1, o2) -> Integer.compare(o2.getValue(), o1.getValue()));
        // The sorted entries must go into a LinkedHashMap, because it preserves insertion order
        Map<String, Integer> linkedHashMap = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entryList) {
            linkedHashMap.put(e.getKey(), e.getValue());
        }
        return linkedHashMap;
    }
}
```
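In `sortMap`, characters with equal counts keep whatever order the `HashMap` iteration happened to produce, and that order is not guaranteed. If a reproducible order for ties matters, one option is a secondary sort on the character itself. The sketch below is an alternative under that assumption, not the author's method; `StableFrequencySortSketch` and `sortByCountThenChar` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only
class StableFrequencySortSketch {

    /** Sorts by count (descending), breaking ties by the character's natural String order. */
    static Map<String, Integer> sortByCountThenChar(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Equal counts: fall back to comparing the keys so the result is deterministic
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // LinkedHashMap preserves the sorted order on iteration
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Counts taken from the 20210802 frequency map; 星 and 天 are tied at 9
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("星", 9);
        counts.put("明", 34);
        counts.put("天", 9);
        System.out.println(sortByCountThenChar(counts));
    }
}
```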
---
## 4、My accounts
| Platform | Link |
| -------------- | -------------- |
| **Project repositories** | [gitlab]( https://gitlab.com/xuyq123/calligraphy ) &ensp; [coding]( https://xyqin.coding.net/public/my/calligraphy/git ) &ensp; [github]( https://github.com/scott180/calligraphy ) &ensp; [bitbucket]( https://bitbucket.org/xu12345/calligraphy ) &ensp; [gitee]( https://gitee.com/xy180/calligraphy ) &ensp; [sourceforge]( https://sourceforge.net/p/calligraphy/code ) &ensp; [vuepress]( https://scott180.github.io/vuepress-calligraphy ) |
| **Media accounts** | [WeChat official account]( https://mp.weixin.qq.com/s/HmdDsCaeumuZg_DfitIdlw ) &ensp; [Toutiao]( https://www.toutiao.com/c/user/token/MS4wLjABAAAA2_bWhiknCbcKNu4c6VTM2B7m2vr7zBrh0x6fSyOrtGU ) &ensp; [Douban]( https://www.douban.com/people/80730595/photos ) &ensp; [Zhihu]( https://www.zhihu.com/people/xu-xian-sheng-72-29/posts ) |
| **Personal email** | 1021151991@qq.com |
---