2025-05-12
Controlling Load Balancing with Nginx + Lua Scripts
Sometimes we provide dedicated private environments for particular customers, yet the API entry point is shared. How do we route a request to the right machine? If Nginx can inspect the request parameters and forward to the matching upstream, the problem is solved.

Nginx can already read request parameters on its own. For example, to read the id from https://api.example.com?id=test:

```nginx
location / {
    echo "appId: $arg_id";
}
```

This works for parameters carried in the URL of a GET request, but it cannot read a POST body, and most APIs today use POST. To evaluate the request parameters more flexibly before forwarding, we turn to Lua.

Lua script configuration

Lua is an open-source, extensible, lightweight, weakly typed, interpreted scripting language implemented in standard C. OpenResty bundles Lua support, so let's go straight to the configuration.

1. Script that maintains the cached routing data (cache.lua)

```lua
-- shared-memory zone holding the upstream mappings
local ups_cache = ngx.shared.ups_cache
local cjson = require "cjson"

local function set_cache(key, value)
    local success, err = ups_cache:set(key, value)
    if not success then
        ngx.log(ngx.ERR, "failed to set cache: ", err)
        return nil
    end
    return true
end

local function get_cache(key)
    return ups_cache:get(key)
end

local function get_ups()
    ngx.req.read_body()  -- make sure the whole request body is read
    local body_data = ngx.req.get_body_data()
    if not body_data then
        ngx.status = ngx.HTTP_BAD_REQUEST
        ngx.say("Failed to read request body")
        return nil
    end
    local ok, json_data = pcall(cjson.decode, body_data)
    if not ok or not json_data then
        ngx.status = ngx.HTTP_BAD_REQUEST
        ngx.say("Invalid JSON data in request body")
        return nil
    end
    local appId = json_data.appId
    return get_cache(appId)
end

return {
    set_cache = set_cache,
    get_cache = get_cache,
    get_ups = get_ups,
}
```

2. Pick the upstream from the cache (get_ups.lua)

```lua
local cacheUtils = require "cache"
local upsValue = cacheUtils.get_ups()
if upsValue then
    ngx.var.my_ups = upsValue
end
```

Nginx configuration

1. Declare a shared-memory zone for the cached routing data:

```nginx
http {
    lua_shared_dict ups_cache 128k;
}
```

2. Expose an endpoint that pushes routing data into Nginx:

```nginx
server {
    listen 80;
    server_name api.xxx.com;

    location /updateCache {
        content_by_lua_block {
            -- JSON helper
            local cjson = require "cjson"
            -- the cache.lua module above
            local cache = require "cache"

            -- read the JSON request body
            ngx.req.read_body()
            local json_data = ngx.req.get_body_data()
            if not json_data then
                ngx.status = ngx.HTTP_BAD_REQUEST
                ngx.say("Failed to read request body")
                return
            end

            -- parse the JSON
            local ok, data = pcall(cjson.decode, json_data)
            if not ok or not data then
                ngx.status = ngx.HTTP_BAD_REQUEST
                ngx.say("Invalid JSON data")
                return
            end

            local key = data.key
            -- store the value as a JSON string
            local value = cjson.encode(data.value)

            if cache.set_cache(key, value) then
                ngx.status = ngx.HTTP_OK
                ngx.say("Data cached successfully")
            else
                ngx.status = ngx.HTTP_INTERNAL_SERVER_ERROR
                ngx.say("Failed to cache data")
            end
        }
    }
}
```

3. The business endpoint:

```nginx
upstream ups_normal {
    server 192.168.1.1:8080;
}
upstream ups_test {
    server 192.168.1.2:8080;
}

server {
    # ... other settings omitted
    location /v1 {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_buffering off;
        # default upstream
        set $my_ups 'ups_normal';
        access_by_lua_file "/opt/get_ups.lua";
        proxy_pass http://$my_ups;
    }
}
```

This approach lets the business side control load balancing very flexibly. If you worry that Nginx shared memory is volatile and the routing data could be lost, add another layer: cache the data in Redis and persist it to a database, then periodically sync it into the Nginx cache so that neither Redis nor the database sits on the request path.
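To sanity-check the setup end to end, the routing data can be pushed and exercised with plain curl. This is a minimal sketch: api.xxx.com and ups_test come from the configuration above, while the appId value customer-a and the request payload are made-up placeholders.

```bash
# push a routing entry: requests whose body carries appId=customer-a should go to ups_test
curl -X POST http://api.xxx.com/updateCache \
     -H 'Content-Type: application/json' \
     -d '{"key": "customer-a", "value": "ups_test"}'

# a POST whose JSON body contains that appId should now be proxied to ups_test
curl -X POST http://api.xxx.com/v1 \
     -H 'Content-Type: application/json' \
     -d '{"appId": "customer-a", "payload": "hello"}'
```

Note that /updateCache stores the value JSON-encoded, so a plain upstream name ends up wrapped in quotes; depending on how $my_ups is consumed you may prefer to store the raw string instead.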
2025-05-11
Vanilla Hadoop Cluster Setup
Production environment versions: JDK 1.7.0_71, Scala 2.11.8, ZooKeeper 3.4.6, Spark 2.1.0, Hive 1.2.1, HBase 1.0.2, MySQL 5.6.33, Kafka 0.8.2.0 (kafka_2.10-0.8.2.0).

Cluster setup

I. Server preparation

1. Mount the data disk (as root)

Data-disk device names are assigned by the system. On I/O-optimized instances data disks are named /dev/vdb upward, through /dev/vdz; if the device name looks like /dev/xvd* (where * is any letter a-z), the instance is not I/O-optimized. List the disks; if /dev/vdb does not appear, the instance has no data disk. Also confirm whether the data disk is already mounted.

```bash
fdisk -l
```

Partition the data disk (a single partition is usually enough):

```bash
fdisk -u /dev/vdb
# p  show the current partition table
# n  create a new partition
# p  choose "primary" as the partition type
# enter the partition number; for a single partition, enter 1
# first sector: press Enter to accept the default 2048
# last sector: press Enter to accept the default (single partition)
# p  review the planned partition layout
# w  write the partition table and exit
```

Check the new partition:

```bash
fdisk -lu /dev/vdb
-----------------------------------------------------------
Disk /dev/vdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x3e60020e

   Device Boot      Start         End      Blocks   Id  System
/dev/vdb1            2048    41943039    20970496   83  Linux
```

Create a file system on the new partition. (If the disk must be shared between Linux, Windows and macOS, use mkfs.vfat to create a VFAT file system instead.)

```bash
mkfs.ext4 /dev/vdb1
```

Back up /etc/fstab:

```bash
cp /etc/fstab /etc/fstab.bak
```

Append the new partition to /etc/fstab and verify it:

```bash
echo "/dev/vdb1 /mnt ext4 defaults 0 0" >> /etc/fstab
cat /etc/fstab
```

Mount the file system:

```bash
mount /dev/vdb1 /app
# to unmount it later:
umount /app
```

Check disk usage; if the new file system shows up, the mount succeeded:

```bash
df -h
```

2. Create the user (as root)

```bash
# create the user with /app as its home directory
useradd -d /app -m hadoop
passwd hadoop
```

3. Set the hostname (as root)

```bash
vim /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=hadoop03
```

4. Edit the hosts file (as root)

```bash
vim /etc/hosts

10.0.0.99  hadoop01
10.0.0.100 hadoop02
10.0.0.101 hadoop03
10.0.0.102 hadoop04
```

Reboot after changing the files above.

5. Set up passwordless SSH

Generate an RSA key pair:

```bash
ssh-keygen -t rsa
```

Collect the public keys of every server (including the local one) on a single machine, build authorized_keys there, then copy it to the other servers:

```bash
scp .ssh/id_rsa.pub hadoop@hadoop02:/app/hadoop/id_rsa.pub
cat id_rsa.pub >> ~/.ssh/authorized_keys
```

Fix the directory permissions:

```bash
chmod 700 -R ~/.ssh
```

Alternatively:

```bash
ssh-copy-id -i ~/.ssh/id_rsa.pub app@192.168.1.233
```

II. JDK and Scala

1. Copy and extract the JDK and Scala packages:

```bash
scp jdk-1.7.0_71.tar hadoop@10.0.0.99:/app/java
tar -xvf jdk-1.7.0_71.tar
```

2. Configure environment variables:

```bash
vim /etc/profile
# append the following
export JAVA_HOME=/app/java/jdk1.7.0_71
export JRE_HOME=/app/java/jdk1.7.0_71/jre
export SCALA_HOME=/app/scala/scala-2.11.8
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$SCALA_HOME/bin:$PATH
```

III. ZooKeeper cluster

1. Download and extract the ZooKeeper package:

```bash
tar -xvf zookeeper-3.4.6.tar
```

2. Create the data and logs directories:

```bash
mkdir data
mkdir logs
```

3. Configure zoo.cfg:

```bash
cp zoo_sample.cfg zoo.cfg
# then edit zoo.cfg
```

```properties
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataLogDir=/app/hadoop/zookeeper3.4.6/logs
dataDir=/app/hadoop/zookeeper3.4.6/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=500
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=24

server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888

# heartbeat interval
tickTime=2000
# minimum session timeout
minSessionTimeout=4000
# maximum session timeout
maxSessionTimeout=100000
```

4. Create the data/myid file:

```bash
echo 1 >> myid
```

5. Copy the directory to the other servers:

```bash
scp -r zookeeper3.4.6/ hadoop@hadoop02:/app/data/
```

6. Edit myid on each server so that it matches the n in that server's server.n entry.

7. Start ZooKeeper:

```bash
./zkServer.sh start
# the cluster status only becomes healthy once every node has started
```

8. Start on boot: to be done.

9. Why an odd number of servers

Fault tolerance. Writes (create/update/delete) need acknowledgement from more than half of the servers, so:

- 2 servers: at least 2 must be running (half of 2 is 1, "more than half" means at least 2) — not a single server may fail.
- 3 servers: at least 2 must be running (half of 3 is 1.5, "more than half" means at least 2) — 1 server may fail.
- 4 servers: at least 3 must be running (half of 4 is 2, "more than half" means at least 3) — 1 server may fail.
- 5 servers: at least 3 must be running (half of 5 is 2.5, "more than half" means at least 3) — 2 servers may fail.
- 6 servers: at least 4 must be running (half of 6 is 3, "more than half" means at least 4) — 2 servers may fail.

So 3-server and 4-server ensembles both tolerate at most one failure, and 5-server and 6-server ensembles both tolerate at most two, yet 4 servers obviously cost more than 3 and 6 cost more than 5. This follows directly from the majority-vote rule, which is why odd sizes are preferred.

Split-brain protection. A ZooKeeper cluster may contain many followers and observers, but it must have exactly one leader. If the leader dies, the remaining servers elect a new one, again by majority vote. Now consider a network partition in an otherwise healthy cluster:

- 3 servers, 1 cut off: the remaining 2 are a majority and can elect a leader.
- 4 servers, 2 cut off: the remaining 2 are not a majority (3 are needed), so no leader can be elected and the cluster stops serving.
- 5 servers, 2 cut off: the remaining 3 are a majority and can elect a leader.
- 6 servers, 3 cut off: the remaining 3 are not a majority (4 are needed), so no leader can be elected.

IV. Hadoop cluster

1. Download the package and copy it to the server:

```bash
scp hadoop-2.6.0.tar.gz hadoop@10.0.0.99:/app/hadoop
```

2. Create the data directories:

```bash
mkdir -p /app/data/hadoop/dfs/tmp
mkdir -p /app/data/hadoop/dfs/data
mkdir -p /app/data/hadoop/dfs/journal
mkdir -p /app/data/hadoop/dfs/name
```

3. Edit hadoop-env.sh:

```bash
# point JAVA_HOME at the JDK
export JAVA_HOME=/app/java/jdk1.7.0_71
```

4. Edit core-site.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <!-- temporary directory -->
    <name>hadoop.tmp.dir</name>
    <value>/app/data/hadoop/dfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop-cluster1</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>4096</value>
  </property>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
           org.apache.hadoop.io.compress.DefaultCodec,
           org.apache.hadoop.io.compress.BZip2Codec,
           org.apache.hadoop.io.compress.SnappyCodec
    </value>
  </property>
  <!-- ZooKeeper quorum -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>
```

5. Edit hdfs-site.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop01:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/app/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/app/data/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-cluster1</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/app/hadoop/hadoop-2.6.0/etc/hadoop/excludes</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hadoop-cluster1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-cluster1.nn1</name>
    <value>hadoop01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-cluster1.nn1</name>
    <value>hadoop01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-cluster1.nn2</name>
    <value>hadoop02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-cluster1.nn2</name>
    <value>hadoop02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop02:8485;hadoop03:8485;hadoop04:8485/hadoop-cluster1</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-cluster1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/app/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/app/data/hadoop/dfs/journal</value>
  </property>
  <!-- enable automatic failover when a NameNode goes down -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
```

6. Edit mapred-site.xml:

```bash
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
vim mapred-site.xml
```

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop01:19888</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
</configuration>
```

7. Edit yarn-site.xml:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop01:8088</value>
  </property>
</configuration>
```

8. Update the environment variables:

```bash
vim /etc/profile

export HADOOP_COMMON_HOME=/app/hadoop/hadoop-2.6.0
export HADOOP_HOME=/app/hadoop/hadoop-2.6.0
export HADOOP_CONF_DIR=/app/hadoop/hadoop-2.6.0/etc/hadoop
export YARN_CONF_DIR=/app/hadoop/hadoop-2.6.0/etc/hadoop
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
export YARN_LOG_DIR=$HADOOP_LOG_DIR
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$SCALA_HOME/bin:$MAVEN_HOME/bin:$ZK_HOME/bin:$HADOOP_COMMON_HOME/bin:$HADOOP_COMMON_HOME/sbin:$PATH
```

9. Distribute the Hadoop directory to the other nodes:

```bash
scp -r hadoop-2.6.0 hadoop@hadoop02:/app/hadoop/
scp -r hadoop-2.6.0 hadoop@hadoop03:/app/hadoop/
scp -r hadoop-2.6.0 hadoop@hadoop04:/app/hadoop/
```

10. Start the JournalNodes:

```bash
./sbin/hadoop-daemon.sh start journalnode
```

11. Format ZKFC:

```bash
./bin/hdfs zkfc -formatZK
```

12. Format the NameNode:

```bash
./bin/hdfs namenode -format
```

13. Start the DataNodes:

```bash
./sbin/hadoop-daemon.sh start datanode
```

14. Start the NameNodes:

```bash
# on namenode1
./sbin/hadoop-daemon.sh start namenode
# on namenode2
./bin/hdfs namenode -bootstrapStandby
./sbin/hadoop-daemon.sh start namenode
```

15. Start YARN:

```bash
./start-yarn.sh
```

16. Check the cluster status at http://10.0.0.99:50070

17. Open issue: native Snappy library support.

```bash
# install the gcc toolchain
yum install -y gcc-c++
```

V. Kafka cluster

1. Download the Kafka package (http://kafka.apache.org/downloads.html) and upload it to the servers.

2. Extract the tarball:

```bash
tar -xvf kafka_2.10-0.8.2.0.tgz
```

3. Edit config/server.properties:

```properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0 # (the "License"); you may not use this file except in compliance with # the License. You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. # see kafka.server.KafkaConfig for additional details and defaults ############################# Server Basics ############################# # The id of the broker. This must be set to a unique integer for each broker. broker.id=1 ############################# Socket Server Settings ############################# # The port the socket server listens on port=9092 # Hostname the broker will bind to. If not set, the server will bind to all interfaces #host.name=localhost # Hostname the broker will advertise to producers and consumers. If not set, it uses the # value for "host.name" if configured. Otherwise, it will use the value returned from # java.net.InetAddress.getCanonicalHostName(). #advertised.host.name=<hostname routable by clients> # The port to publish to ZooKeeper for clients to use. If this is not set, # it will publish the same port that the broker binds to. #advertised.port=<port accessible by clients> # The number of threads handling network requests num.network.threads=3 # The number of threads doing disk I/O num.io.threads=8 # The send buffer (SO_SNDBUF) used by the socket server socket.send.buffer.bytes=102400 # The receive buffer (SO_RCVBUF) used by the socket server socket.receive.buffer.bytes=102400 # The maximum size of a request that the socket server will accept (protection against OOM) socket.request.max.bytes=104857600 ############################# Log Basics ############################# # A comma seperated list of directories under which to store log files log.dirs=/app/hadoop/kafka_2.10-0.8.2.0/logs # The default number of log partitions per topic. More partitions allow greater # parallelism for consumption, but this will also result in more files across # the brokers. num.partitions=6 # The number of threads per data directory to be used for log recovery at startup and flushing at shutdown. # This value is recommended to be increased for installations with data dirs located in RAID array. num.recovery.threads.per.data.dir=1 ############################# Log Flush Policy ############################# # Messages are immediately written to the filesystem but by default we only fsync() to sync # the OS cache lazily. The following configurations control the flush of data to disk. # There are a few important trade-offs here: # 1. Durability: Unflushed data may be lost if you are not using replication. # 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush. # 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to exceessive seeks. # The settings below allow one to configure the flush policy to flush data after a period of time or # every N messages (or both). This can be done globally and overridden on a per-topic basis. 
# The number of messages to accept before forcing a flush of data to disk #log.flush.interval.messages=10000 # The maximum amount of time a message can sit in a log before we force a flush #log.flush.interval.ms=1000 ############################# Log Retention Policy ############################# # The following configurations control the disposal of log segments. The policy can # be set to delete segments after a period of time, or after a given size has accumulated. # A segment will be deleted whenever *either* of these criteria are met. Deletion always happens # from the end of the log. # The minimum age of a log file to be eligible for deletion log.retention.hours=48 # A size-based retention policy for logs. Segments are pruned from the log as long as the remaining # segments don't drop below log.retention.bytes. #log.retention.bytes=1073741824 # The maximum size of a log segment file. When this size is reached a new log segment will be created. log.segment.bytes=1073741824 # The interval at which log segments are checked to see if they can be deleted according # to the retention policies log.retention.check.interval.ms=300000 # By default the log cleaner is disabled and the log retention policy will default to just delete segments after their retention expires. # If log.cleaner.enable=true is set the cleaner will be enabled and individual logs can then be marked for log compaction. log.cleaner.enable=false ############################# Zookeeper ############################# # Zookeeper connection string (see zookeeper docs for details). # This is a comma separated host:port pairs, each corresponding to a zk # server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002". # You can also append an optional chroot string to the urls to specify the # root directory for all kafka znodes. 
zookeeper.connect=hadoop01:2181,hadoop02:2181,hadoop03:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
```

4. Copy the package to the other nodes:

```bash
scp -r kafka_2.10-0.8.2.0 hadoop@hadoop03:/app/hadoop
```

5. Start Kafka on every server:

```bash
./kafka-server-start.sh ../config/server.properties &
```

6. Verify the cluster:

```bash
# create a test topic
./kafka-topics.sh --create --zookeeper hadoop01:2181 --replication-factor 3 --partitions 1 --topic wxtest

# start a producer on hadoop04 and push data to hadoop03
./kafka-console-producer.sh --broker-list hadoop03:9092 --topic wxtest
test for hadoop03

# stop the broker on hadoop03
./kafka-server-stop.sh

# start a consumer on hadoop02 and check that it receives the data pushed from hadoop04
./kafka-console-consumer.sh --zookeeper hadoop01:2181 --topic wxtest --from-beginning
```

7. Open issue: after Kafka starts, the ZooKeeper instance on hadoop02 sometimes goes down.

VI. MySQL installation

```bash
./mysql_install_db --verbose --user=hadoop --defaults-file=/app/hadoop/mysql-5.6.33-linux-glibc2.5-x86_64/my.cnf --datadir=/app/data/mysql/data/ --basedir=/app/hadoop/mysql-5.6.33-linux-glibc2.5-x86_64 --pid-file=/app/data/mysql/data/mysql.pid --tmpdir=/app/data/mysql/tmp
cp support-files/mysql.server /etc/init.d/mysql
./mysqld_safe --defaults-file=/etc/my.cnf --socket=/app/data/mysql/tmp/mysql.sock --user=hadoop
./mysql -h localhost -S /app/data/mysql/tmp/mysql.sock -u root -p
```

```sql
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'rootbqs123' WITH GRANT OPTION;
create database hive;
alter database hive character set latin1;
```

VII. Hive

1. Download apache-hive-1.2.1-bin.tar.gz (http://hive.apache.org/downloads.html), upload it to the server and extract it:

```bash
tar -xvf apache-hive-1.2.1-bin.tar.gz
```

2. Edit hive-env.sh. Note: hive-env.sh does not exist initially; copy it from the template first:

```bash
cp hive-env.sh.template hive-env.sh
vim hive-env.sh
```

3. Edit hive-site.xml.

4. Start the Hive metastore:

```bash
# without -p the metastore listens on port 9083 by default
hive --service metastore -p <port_num>
# enter the CLI on a client
hive
```

5. Notes

```bash
# startup error: /tmp/hive on HDFS should be writable. Current permissions are: rwx--x--x
# the current user has no write permission on HDFS; fix the permissions:
hadoop fs -chmod -R 777 /tmp
```

VIII. HBase

HBase runs as a distributed cluster with the following node layout:

| Node | Roles |
| --- | --- |
| hadoop01 | HMaster |
| hadoop02 | HMaster (backup), RegionServer |
| hadoop03 | RegionServer |
| hadoop04 | RegionServer |

1. Upload hbase-1.0.2-bin.tar.gz to the servers (download: http://archive.apache.org/dist/hbase/hbase-1.0.2/) and extract it:

```bash
tar -xvf hbase-1.0.2-bin.tar.gz
```

2. Edit hbase-env.sh:

```bash
# Set environment variables here.

# This script sets variables multiple times over the course of starting an hbase process,
# so try to keep things idempotent unless you want to take an even deeper look
# into the startup scripts (bin/hbase, etc.)

# The java implementation to use.  Java 1.7+ required.
export JAVA_HOME=/app/java/jdk1.7.0_71/

# Extra Java CLASSPATH elements.  Optional.
# export HBASE_CLASSPATH=

# The maximum amount of heap to use. Default is left to JVM default.
export HBASE_HEAPSIZE=1G export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/ export HBASE_LIBRARY_PATH=$HBASE_LIBRARY_PATH:$HBASE_HOME/lib/native/ # Uncomment below if you intend to use off heap cache. For example, to allocate 8G of # offheap, set the value to "8G". # export HBASE_OFFHEAPSIZE=1G # Extra Java runtime options. # Below are what we set by default. May only work with SUN JVM. # For more on why as well as other possible settings, # see http://wiki.apache.org/hadoop/PerformanceTuning export HBASE_OPTS="$HBASE_OPTS -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly" # Uncomment one of the below three options to enable java garbage collection logging for the server-side processes. # This enables basic gc logging to the .out file. # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" # This enables basic gc logging to its own file. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" # Uncomment one of the below three options to enable java garbage collection logging for the client processes. # This enables basic gc logging to the .out file. # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps" # This enables basic gc logging to its own file. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>" # This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+. # If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR . # export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M" # See the package documentation for org.apache.hadoop.hbase.io.hfile for other configurations # needed setting up off-heap block caching. # Uncomment and adjust to enable JMX exporting # See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access. # More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html # NOTE: HBase provides an alternative JMX implementation to fix the random ports issue, please see JMX # section in HBase Reference Guide for instructions. 
# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false" # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101" # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102" # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103" # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104" # export HBASE_REST_OPTS="$HBASE_REST_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10105" # File naming hosts on which HRegionServers will run. $HBASE_HOME/conf/regionservers by default. # export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers # Uncomment and adjust to keep all the Region Server pages mapped to be memory resident #HBASE_REGIONSERVER_MLOCK=true #HBASE_REGIONSERVER_UID="hbase" # File naming hosts on which backup HMaster will run. $HBASE_HOME/conf/backup-masters by default. export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters # Extra ssh options. Empty by default. # export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR" # Where log files are stored. $HBASE_HOME/logs by default. # export HBASE_LOG_DIR=${HBASE_HOME}/logs # Enable remote JDWP debugging of major HBase processes. Meant for Core Developers # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070" # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071" # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072" # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073" # A string representing this instance of hbase. $USER by default. # export HBASE_IDENT_STRING=$USER # The scheduling priority for daemon processes. See 'man nice'. # export HBASE_NICENESS=10 # The directory where pid files are stored. /tmp by default. export HBASE_PID_DIR=/app/hadoop/hbase-1.0.2/pids # Seconds to sleep between slave commands. Unset by default. This # can be useful in large clusters, where, e.g., slave rsyncs can # otherwise arrive faster than the master can service them. # export HBASE_SLAVE_SLEEP=0.1 # Tell HBase whether it should manage it's own instance of Zookeeper or not. export HBASE_MANAGES_ZK=false # The default log rolling policy is RFA, where the log file is rolled as per the size defined for the # RFA appender. Please refer to the log4j.properties file to see more details on this appender. # In case one needs to do log rolling on a date change, one should set the environment property # HBASE_ROOT_LOGGER to "<DESIRED_LOG LEVEL>,DRFA". # For example: # HBASE_ROOT_LOGGER=INFO,DRFA # The reason for changing default to RFA is to avoid the boundary case of filling out disk space as # DRFA doesn't put any cap on the log size. Please refer to HBase-5655 for more context. 
```

3. Edit hbase-site.xml.

4. Edit the regionservers file and list the servers that will act as RegionServers:

```
hadoop02
hadoop03
hadoop04
```

5. Create the backup-masters file and write the backup master node into it:

```
hadoop02
```

6. Link the Hadoop configuration files:

```bash
ln -s /app/hadoop/hadoop2.6.0/etc/hadoop/hdfs-site.xml /app/hadoop/hbase-1.0.2/conf/
ln -s /app/hadoop/hadoop2.6.0/etc/hadoop/core-site.xml /app/hadoop/hbase-1.0.2/conf/
```

7. Copy the directory to the other servers:

```bash
scp -r /app/hadoop/hbase-1.0.2/ hadoop@hadoop02:/app/hadoop/
scp -r /app/hadoop/hbase-1.0.2/ hadoop@hadoop03:/app/hadoop/
scp -r /app/hadoop/hbase-1.0.2/ hadoop@hadoop04:/app/hadoop/
```

8. Configure environment variables:

```bash
vim /etc/profile
# add the following
export HBASE_HOME=/app/hadoop/hbase-1.0.2
export PATH=$HBASE_HOME/bin:$PATH
```

9. Start the services.

Option 1:

```bash
start-hbase.sh
# startup log:
starting master, logging to /app/hadoop/hbase-1.0.2/logs/hbase-hadoop-master-hadoop01.out
hadoop04: starting regionserver, logging to /app/hadoop/hbase-1.0.2/bin/../logs/hbase-hadoop-regionserver-hadoop04.out
hadoop02: starting regionserver, logging to /app/hadoop/hbase-1.0.2/bin/../logs/hbase-hadoop-regionserver-hadoop02.out
hadoop03: starting regionserver, logging to /app/hadoop/hbase-1.0.2/bin/../logs/hbase-hadoop-regionserver-hadoop03.out
hadoop02: starting master, logging to /app/hadoop/hbase-1.0.2/bin/../logs/hbase-hadoop-master-hadoop02.out
```

Option 2:

```bash
# start the master
hbase-daemon.sh start master
# start a regionserver
hbase-daemon.sh start regionserver
```

10. Problem: startup exception

```
java.lang.RuntimeException: Failed construction of Regionserver: class org.apache.hadoop.hbase.regionserver.HRegionServer
    at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2523)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:64)
    at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2538)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.constructRegionServer(HRegionServer.java:2521)
    ... 5 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1905)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.<init>(RSRpcServices.java:769)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:575)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:492)
    ... 10 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1811)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
    ... 13 more
```

Cause: hbase-site.xml contained Phoenix-related settings, but the matching jars were missing from the lib directory. Two fixes: 1. remove the Phoenix settings; 2. copy the Phoenix jars into each node's lib directory. Here the cluster was brought up with option 1; the Phoenix settings will be added back later.

11. Integrating Hive with HBase

Hive and HBase are integrated through the public APIs the two systems expose to each other; the glue is the hive-hbase-handler-*.jar shipped in Hive's lib directory, so it is enough to copy that jar into hbase/lib.

```bash
# copy into the local HBase lib directory
cp /app/hadoop/apache-hive-1.2.1-bin/lib/hive-hbase-handler-1.2.1.jar /app/hadoop/hbase-1.0.2/lib/
# copy to the other hosts
cd /app/hadoop/apache-hive-1.2.1-bin/lib
scp hive-hbase-handler-1.2.1.jar hadoop@hadoop02:/app/hadoop/hbase-1.0.2/lib/
```

Test the integration. Open hive and hbase shell on different hosts:

```bash
hive
hbase shell
```

Create the table 'wx_test_hive_hbase' in HBase:

```
create 'wx_test_hive_hbase','INFO'
```

Create the external tables in Hive:

```sql
create external table hive_wx_test_hive_hbase(
  id string,
  area_code string,
  area_desc string
)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,INFO:areaCode,INFO:areaDesc")
TBLPROPERTIES("hbase.table.name" = "wx_test_hive_hbase");

create external table hive_wx_test(
  id string,
  area_code string
)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,INFO:areaCode")
TBLPROPERTIES("hbase.table.name" = "wx_test");
```

Insert different rows from Hive and from HBase respectively; from the HBase shell, for example:

```
put 'wx_test_hive_hbase','00001','INFO:areaCode','0001'
put 'wx_test_hive_hbase','00001','INFO:areaDesc','深圳'
```

IX. Spark cluster

1. Download Spark (https://archive.apache.org/dist/spark/spark-2.1.0/) and upload it to the servers.

2. Edit spark-env.sh. Note: the distribution only ships spark-env.sh.template, so copy it first:

```bash
cp spark-env.sh.template spark-env.sh
# then edit the configuration as follows:
```

3.修改u
2024-01-02
Changing a User's Password in GitLab
1. Open the Rails console and run steps 2-5 inside it:

```bash
sudo gitlab-rails console
```

2. Look up the user by username, user ID, or e-mail address:

```ruby
# by username
user = User.find_by_username 'yourusername'
# by user ID
user = User.find(123)
# by e-mail address
user = User.find_by(email: 'user@example.com')
```

3. Reset the password:

```ruby
new_password = 'examplepassword'
# or generate a random one:
# new_password = ::User.random_password
user.password = new_password
user.password_confirmation = new_password
```

4. Send the notification e-mail:

```ruby
user.send_only_admin_changed_your_password_notification!
```

5. Save and exit:

```ruby
user.save!
exit
```

If the change still does not take effect, restart GitLab:

```bash
gitlab-ctl restart
```
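As a side note beyond the steps above, recent GitLab releases also ship a rake task that wraps the same console workflow; the exact task name and availability depend on your GitLab version, so verify it against the documentation for your installation.

```bash
# interactively prompts for the username and the new password
sudo gitlab-rake "gitlab:password:reset"
```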
2023-08-01
An Address Similarity Algorithm
Given two Chinese addresses, how do we measure their similarity as accurately as possible and decide whether they refer to the same, or at least a nearby, place? There are two common approaches.

One is to compute the similarity of the two strings directly. The more ambiguous text the addresses contain, the less reliable this becomes: two streets with the same name in different regions look similar as strings yet are geographically far apart.

The other is to compare GPS coordinates. If the coordinates are close, the two addresses are near each other, but that alone cannot prove they are the same address unless the coordinates are essentially identical.

To judge both textual similarity and geographic proximity sensibly, the two approaches can be combined into a single score. There are many string-similarity algorithms (cosine similarity, matrix similarity, edit distance, and so on); this article uses the Jaro-Winkler variant of edit distance.

1. Edit-distance score

```java
import java.util.Arrays;

public class JaroWinklerDistance {

    private static final float threshold = 0.8f;

    public static float getDegree(String s1, String s2) {
        int[] mtp = matches(s1, s2);
        // number of matching characters (m)
        float m = (float) mtp[0];
        if (m == 0) {
            return 0f;
        }
        // Jaro distance
        float j = ((m / s1.length() + m / s2.length() + (m - mtp[1]) / m)) / 3;
        // Jaro-Winkler distance; the scaling factor used here is Math.min(0.1f, 1f / mtp[3])
        float jw = j < threshold ? j : j + Math.min(0.1f, 1f / mtp[3]) * mtp[2] * (1 - j);
        return jw;
    }

    private static int[] matches(String s1, String s2) {
        String max, min;
        if (s1.length() > s2.length()) {
            max = s1;
            min = s2;
        } else {
            max = s2;
            min = s1;
        }
        // two characters from s1 and s2 count as matching only if they are no further
        // apart than floor(max(|s1|,|s2|) / 2) - 1, so the search stops beyond that range
        int range = Math.max(max.length() / 2 - 1, 0);
        // for each position of the shorter string, the matched index in the longer one
        int[] matchIndexes = new int[min.length()];
        Arrays.fill(matchIndexes, -1);
        // flags for matched positions of the longer string
        boolean[] matchFlags = new boolean[max.length()];
        // number of matches
        int matches = 0;
        // outer loop over the shorter string
        for (int mi = 0; mi < min.length(); mi++) {
            char c1 = min.charAt(mi);
            // search backwards and forwards within the allowed range
            for (int xi = Math.max(mi - range, 0), xn = Math.min(mi + range + 1, max.length()); xi < xn; xi++) {
                // skip characters that are already matched; stop at the first match
                if (!matchFlags[xi] && c1 == max.charAt(xi)) {
                    matchIndexes[mi] = xi;
                    matchFlags[xi] = true;
                    matches++;
                    break;
                }
            }
        }
        // matched characters of the shorter string, in order
        char[] ms1 = new char[matches];
        // matched characters of the longer string, in order
        char[] ms2 = new char[matches];
        for (int i = 0, si = 0; i < min.length(); i++) {
            if (matchIndexes[i] != -1) {
                ms1[si] = min.charAt(i);
                si++;
            }
        }
        for (int i = 0, si = 0; i < max.length(); i++) {
            if (matchFlags[i]) {
                ms2[si] = max.charAt(i);
                si++;
            }
        }
        // count transpositions
        int transpositions = 0;
        for (int mi = 0; mi < ms1.length; mi++) {
            if (ms1[mi] != ms2[mi]) {
                transpositions++;
            }
        }
        // count the length of the common prefix
        int prefix = 0;
        for (int mi = 0; mi < min.length(); mi++) {
            if (s1.charAt(mi) == s2.charAt(mi)) {
                prefix++;
            } else {
                break;
            }
        }
        // return: matches (m), transpositions (t), common prefix length, length of the longer string
        return new int[] { matches, transpositions / 2, prefix, max.length() };
    }
}
```

2. GPS distance

The GPS coordinates for a Chinese address can be obtained from an address database or a third-party geocoding API. With the longitude and latitude of both addresses, compute the distance between them and map that distance to a similarity score.

```java
public class GpsUtils {

    /** earth radius in metres */
    private static final double EARTH_RADIUS = 6378137;

    private static double rad(String d) {
        return Double.valueOf(d) * Math.PI / 180.0;
    }

    public static double distance(Location location1, Location location2) {
        if (location1 != null && location2 != null) {
            return distance(location1.getLng(), location1.getLat(), location2.getLng(), location2.getLat());
        }
        return Integer.MAX_VALUE;
    }

    /**
     * Distance in metres between two points given as longitude/latitude strings.
     */
    public static double distance(String lng1, String lat1, String lng2, String lat2) {
        double radLat1 = rad(lat1);
        double radLat2 = rad(lat2);
        double a = radLat1 - radLat2;
        double b = rad(lng1) - rad(lng2);
        double s = 2 * Math.asin(Math.sqrt(
                Math.pow(Math.sin(a / 2), 2) + Math.cos(radLat1) * Math.cos(radLat2) * Math.pow(Math.sin(b / 2), 2)));
        s = s * EARTH_RADIUS;
        s = Math.round(s * 10000) / 10000.0;   // keep four decimal places
        return s;
    }

    /**
     * Map a distance to a matching degree (reference values).
     */
    public static float getDegreeByDistance(double distance) {
        float matchedDegree = 0f;
        if (distance > 10000) {
            matchedDegree = 0.0f;   // ( >10000)      0
        } else if (distance > 5000) {
            matchedDegree = 0.1f;   // (5000~10000]   0.1
        } else if (distance > 2000) {
            matchedDegree = 0.2f;   // (2000~5000]    0.2
        } else if (distance > 1000) {
            matchedDegree = 0.3f;   // (1000~2000]    0.3
        } else if (distance > 500) {
            matchedDegree = 0.4f;   // (500~1000]     0.4
        } else if (distance > 200) {
            matchedDegree = 0.5f;   // (200~500]      0.5
        } else if (distance > 100) {
            matchedDegree = 0.6f;   // (100~200]      0.6
        } else if (distance > 50) {
            matchedDegree = 0.7f;   // (50~100]       0.7
        } else if (distance > 20) {
            matchedDegree = 0.8f;   // (20~50]        0.8
        } else if (distance > 10) {
            matchedDegree = 0.9f;   // (10~20]        0.9
        } else {
            matchedDegree = 0.95f;  // [0~10]         0.95
        }
        return matchedDegree;
    }
}
```

```java
import java.io.Serializable;

public class Location implements Serializable {

    private static final long serialVersionUID = 1L;

    /** longitude */
    private String lng;
    /** latitude */
    private String lat;

    public Location() {
    }

    public Location(String lng, String lat) {
        super();
        this.lng = lng;
        this.lat = lat;
    }

    public String getLng() {
        return lng;
    }

    public void setLng(String lng) {
        this.lng = lng;
    }

    public String getLat() {
        return lat;
    }

    public void setLat(String lat) {
        this.lat = lat;
    }

    @Override
    public String toString() {
        return "Location [lng=" + lng + ", lat=" + lat + "]";
    }
}
```

3. Combining the two scores (reference)

```java
// matching degree derived from the GPS distance
double distance = GpsUtils.distance(location1, location2);
float gpsMatchedDegree = GpsUtils.getDegreeByDistance(distance);
if (gpsMatchedDegree == 0.0f) {
    System.out.println("GPS distance too large");
    return;
}
System.out.println("GPS matching degree: " + gpsMatchedDegree);

// string similarity
float stringDegree = JaroWinklerDistance.getDegree(address1, address2);
System.out.println("String matching degree: " + stringDegree);

float result = 0.0f;
// derive the final similarity from the two scores
if (stringDegree >= 0.5f && gpsMatchedDegree >= 0.5f) {
    result = Math.min(stringDegree, gpsMatchedDegree);
} else if (stringDegree < 0.5f && gpsMatchedDegree < 0.5f) {
    result = Math.max(stringDegree, gpsMatchedDegree);
} else {
    result = (stringDegree + gpsMatchedDegree) / 2;
}
System.out.println("Final matching degree: " + result);
```
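A minimal way to exercise the snippet above is to wrap it in a small program. Everything here is illustrative: the two addresses and their coordinates are made-up sample values, and combinedDegree is a hypothetical helper that simply packages the logic shown in step 3.

```java
public class AddressSimilarityDemo {

    // wraps step 3: combine the GPS-based and string-based degrees into one score
    static float combinedDegree(String address1, Location location1, String address2, Location location2) {
        float gps = GpsUtils.getDegreeByDistance(GpsUtils.distance(location1, location2));
        float str = JaroWinklerDistance.getDegree(address1, address2);
        if (str >= 0.5f && gps >= 0.5f) {
            return Math.min(str, gps);
        } else if (str < 0.5f && gps < 0.5f) {
            return Math.max(str, gps);
        }
        return (str + gps) / 2;
    }

    public static void main(String[] args) {
        // sample data only; real coordinates would come from a geocoding service
        String address1 = "广东省深圳市南山区科技园路1号";
        String address2 = "深圳市南山区科技园路1号";
        Location loc1 = new Location("113.95360", "22.54036");
        Location loc2 = new Location("113.95365", "22.54040");
        System.out.println("combined degree: " + combinedDegree(address1, loc1, address2, loc2));
    }
}
```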
2023-05-11
[Study Notes] New Features in ECMAScript
1. Concept

ES is short for ECMAScript, the specification of the scripting language; JavaScript is an implementation of ECMAScript, so "new ES features" simply means new JavaScript features.

1.1. Timeline

| Year | Version | Official name | Main new features | Notable events / notes |
| --- | --- | --- | --- | --- |
| 1997 | ECMAScript 1 | ES1 | First standardized edition; defined the basic syntax and features of JavaScript. | Published by Ecma International as the first edition of ECMA-262. |
| 1998 | ECMAScript 2 | ES2 | Minor update kept in sync with ISO/IEC 16262. | Editorial corrections only; no major new features. |
| 1999 | ECMAScript 3 | ES3 | Regular expressions, try/catch exception handling, instanceof, typeof, for...in, and more. | Became the de facto JavaScript standard with wide support; the foundation of modern JavaScript. |
| 2007 | ECMAScript 4 | Never released | Ambitious draft features: modules, classes, static typing, generics. | The draft was abandoned after much controversy; parts were folded into later versions. |
| 2008 | ECMAScript 3.1 | ES5 (renamed) | Small improvements on ES3 (JSON support, strict mode "use strict"). | Originally planned as a subset of ES4, later renamed to ES5. |
| 2009 | ECMAScript 5 | ES5 | Strict mode, JSON support, new object methods (such as Object.create), array iteration methods. | Officially released; supported by mainstream browsers. |
| 2011 | ECMAScript 5.1 | ES5.1 | Aligned with the ISO/IEC 16262 international standard; fixed some ES5 defects. | Became ISO/IEC 16262:2011. |
| 2013 | ECMAScript 6 draft | ES6 (draft frozen) | Classes, modules, arrow functions, destructuring, Promise, template strings, and more. | Draft frozen; no new features added while under discussion. |
| 2015 | ECMAScript 2015 | ES6 / ES2015 | Classes, modules, arrow functions, destructuring, let/const, Promise, template strings. | Official release and the first year-based name (ES6 is also called ES2015); a major upgrade to JavaScript. |
| 2016 | ECMAScript 2016 | ES7 / ES2016 | Exponentiation operator (**), Array.prototype.includes(). | First annual release; few features, but the yearly cadence begins. |
| 2017 | ECMAScript 2017 | ES8 / ES2017 | Object.values()/Object.entries(), async/await, String.prototype.padStart()/padEnd(). | async/await greatly simplifies asynchronous programming. |
| 2018 | ECMAScript 2018 | ES9 / ES2018 | Promise.prototype.finally(), asynchronous iteration (for await...of), rest/spread properties for objects. | Strengthens asynchronous programming and syntax sugar. |
| 2019 | ECMAScript 2019 | ES10 / ES2019 | Array.prototype.flat()/flatMap(), Object.fromEntries(), String.prototype.trimStart()/trimEnd(). | More array and string conveniences. |
| 2020 | ECMAScript 2020 | ES11 / ES2020 | Nullish coalescing operator (??), optional chaining (?.), dynamic import(), BigInt, Promise.allSettled(), globalThis. | BigInt for large integers; finer-grained async control. |
| 2021 | ECMAScript 2021 | ES12 / ES2021 | Logical assignment operators (&&=, ||=, ??=), String.prototype.replaceAll(), Promise.any(). | |

2. ES6

2.1. Iterators

By implementing the Symbol.iterator method, you can make any object iterable.

```javascript
const customIterable = {
  data: [10, 20, 30],
  [Symbol.iterator]() {
    let index = 0;
    const data = this.data;
    return {
      next() {
        if (index < data.length) {
          return { value: data[index++], done: false };
        } else {
          return { done: true };
        }
      }
    };
  }
};

for (const num of customIterable) {
  console.log(num); // prints 10, 20, 30
}
```

2.2. Generators

A generator is a special function declared with function*. It can pause and resume execution through the yield keyword and implements the iterator protocol automatically. yield suspends the function and hands a value to the outer iterator; the next call to next() resumes from where it paused.

```javascript
function* simpleGenerator() {
  yield 1;
  yield 2;
  yield 3;
}

const gen = simpleGenerator();
console.log(gen.next());       // { value: 1, done: false }
console.log(gen.next());       // { value: 2, done: false }
console.log(gen.return(100));  // terminates early: { value: 100, done: true }
```

```javascript
function* infiniteSequence() {
  let i = 0;
  while (true) {
    yield i++;
  }
}

const gen = infiniteSequence();
console.log(gen.next().value); // 0
console.log(gen.next().value); // 1
// ...and so on forever
```

```javascript
function* fibonacci() {
  let a = 0, b = 1;
  while (true) {
    yield a;
    [a, b] = [b, a + b];
  }
}

const fib = fibonacci();
console.log(fib.next().value); // 0
console.log(fib.next().value); // 1
console.log(fib.next().value); // 1
console.log(fib.next().value); // 2
```

2.3. Promise

A Promise is an object representing the eventual completion or failure of an asynchronous operation — an event whose result will only be known in the future (a network request, a file read, and so on).

Purpose: handling asynchronous work (network requests, file I/O) while avoiding callback hell.

States:
- pending: the initial state, neither fulfilled nor rejected.
- fulfilled: the operation succeeded; resolve was called.
- rejected: the operation failed; reject was called.

Characteristics: once the state changes (from pending to fulfilled or rejected) it cannot change again, and the result can only be consumed through then or catch.

2.3.1. Creating a Promise

Create one with the new Promise(executor) constructor:

```javascript
const myPromise = new Promise((resolve, reject) => {
  // asynchronous work (e.g. a network request)
  const success = true;
  if (success) {
    resolve("operation succeeded"); // call resolve on success
  } else {
    reject("operation failed");     // call reject on failure
  }
});
```

2.3.2. Basic usage

then handles the fulfilled state and accepts two callbacks (success and failure):

```javascript
myPromise
  .then((value) => {
    console.log(value); // prints "operation succeeded"
  }, (error) => {
    console.error(error);
  });
```

catch handles the rejected state and is equivalent to .then(null, rejectCallback):

```javascript
myPromise
  .then((value) => console.log(value))
  .catch((error) => console.error(error));
```

The finally method runs whether the promise is fulfilled or rejected (introduced in ES2018):

```javascript
myPromise
  .then(data => console.log(data))
  .catch(error => console.error(error))
  .finally(() => console.log("operation finished"));
```

A complete example:

```javascript
// simulate an asynchronous request
function fetchData() {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      Math.random() > 0.5 ? resolve("data loaded") : reject("network error");
    }, 1000);
  });
}

// usage
fetchData()
  .then(data => console.log(data))
  .catch(error => console.error(error))
  .finally(() => console.log("request finished"));
```

2.3.3. Typical use cases

Network requests:

```javascript
fetch("https://api.example.com/data")
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error("request failed", error));
```

File I/O:

```javascript
const readFile = (path) =>
  new Promise((resolve, reject) => {
    fs.readFile(path, (err, data) => {
      if (err) reject(err);
      else resolve(data);
    });
  });
```

Delays:

```javascript
const delay = (ms) => new Promise(resolve => setTimeout(resolve, ms));
delay(1000).then(() => console.log("runs after 1 second"));
```

2.3.4. Advanced usage

(1) Promise.all()

Runs several promises in parallel and resolves with an array of results once all of them succeed:

```javascript
const promise1 = Promise.resolve(1);
const promise2 = new Promise((resolve) => setTimeout(() => resolve(2), 1000));
const promise3 = Promise.reject("error");

Promise.all([promise1, promise2])
  .then((results) => console.log(results)); // [1, 2]

Promise.all([promise1, promise2, promise3])
  .catch((error) => console.error(error));  // "error"
```

(2) Promise.race()

Settles with the result (or error) of whichever promise finishes first:

```javascript
const fastPromise = Promise.resolve("fast");
const slowPromise = new Promise((resolve) => setTimeout(() => resolve("slow"), 2000));

Promise.race([fastPromise, slowPromise])
  .then((result) => console.log(result)); // "fast"
```

(3) Promise.allSettled() (ES2020)

Waits for every promise to settle, whether it fulfilled or rejected:

```javascript
const promises = [
  Promise.resolve("ok"),
  Promise.reject("failed"),
];

Promise.allSettled(promises).then((results) => {
  results.forEach((result) => {
    if (result.status === "fulfilled") {
      console.log("fulfilled:", result.value);
    } else {
      console.log("rejected:", result.reason);
    }
  });
});
```

2.4. Modules

Modules split a large program into many small files that are then composed together.

2.4.1. Syntax

export exposes a module's public interface; import pulls in the interface of another module.

```javascript
// utils.js
export function isNull() {}
export function isNotNull() {}

// main.js — any of the following forms
import * as utils from "./utils.js";
import { isNull, isNotNull } from "./utils.js";
import { isNull } from "./utils.js";
```

In HTML:

```html
<script type="module">
  // import statements go here
</script>
```

2.4.2. Ways to export

Named exports, one by one:

```javascript
export let a = 1;
export function isNull() {}
export function isNotNull() {}
```

A single export list:

```javascript
let a = 1;
function isNull() {}
export { a, isNull };
```

Default export:

```javascript
export default {
  name: "zhangsan",
  play: function () {
    console.log("play basketball");
  }
};
```

2.4.3. Ways to import

General (namespace) form: `import * as m1 from "./a.js"`

Destructuring-style named imports: `import { name as nickname, play } from "./a.js"`, `import { default as m1 } from "./a.js"`

Shorthand for a default export: `import m1 from "./a.js"`

3. ES8

3.1. async & await

async declares an asynchronous function and automatically wraps its return value in a Promise:
- The function always returns a Promise.
- If it returns a plain value (e.g. return 1), the value is wrapped as Promise.resolve(value).
- If it throws (e.g. throw new Error(...)), the error is wrapped as Promise.reject(error).

```javascript
async function fetchData() {
  // asynchronous work (returns a Promise)
  return "data";
}

// calling an async function
const result = fetchData();             // a Promise
result.then(data => console.log(data)); // "data"
```

await pauses the async function until the awaited Promise settles and then yields its result:
- It suspends only the surrounding async function (fulfilled or rejected), not the main thread; other code keeps running.
- It can only be used inside an async function.

```javascript
async function getData() {
  try {
    const response = await fetch("https://api.example.com/data"); // wait for the Promise
    const data = await response.json();                           // wait for the next Promise
    console.log(data);
  } catch (error) {
    console.error("error:", error);
  }
}
getData();
```

Execution-order example:

```javascript
async function asyncFunc() {
  console.log("Start");
  const result = await slowPromise(); // pauses here until the Promise settles
  console.log("Result:", result);
  console.log("End");
}

asyncFunc();
console.log("Main code continues..."); // runs immediately; the main thread is not blocked

function slowPromise() {
  return new Promise(resolve => setTimeout(() =>
resolve("Done"), 1000)); }应用示例://定义一个返回Promise对象的异步函数 function sendAJAX(url){ return new Promise(()=>{ const x = new XMLHttpRequest(); x.open('GET', url); x.send(); x.onreadystatechange = function () { if(x.readyState === 4){ if(x.status >= 200 && x.status < 300){ resolve(x.response) }else{ reject(x.status) } } } }); } sendAJAX("1").then(value=>console.log(value)); async function send(){ try{ let result = await sendAJAX(""); console.log(result); }catch(error){ } } send(); ES11可选链操作符function readConfig(config){ const dbHost = config?.db?.host; console.log(dbHost);//10.0.0.1 } main({ db:{ host:'10.0.0.1', username:'root' }, cache:{ host:'10.0.0.1', username:'admin' } }); 动态import//方法内部 import('./').then((module)=>{ module.hello(); });globalThis全局对象,