Preface
Arthas is Alibaba's open source Java diagnostic tool. It supports online troubleshooting without restarting the application, dynamic tracing of Java code, and real-time monitoring of JVM status. For online anomalies where every second counts, Arthas can help us diagnose the problem quickly.
Download and install
Download arthas-boot.jar with wget:
wget https://alibaba.github.io/arthas/arthas-boot.jar
After downloading arthas-boot.jar, first take a look at the help information, which you can view with the command java -jar arthas-boot.jar -h. It includes examples and parameter descriptions:
[root@izwz94a0v1sz0gk4rezdcbz arthas]# java -jar arthas-boot.jar -h
[INFO] arthas-boot version: 3.1.4
Usage: arthas-boot [-h] [--target-ip <value>] [--telnet-port <value>]
[--http-port <value>] [--session-timeout <value>] [--arthas-home <value>]
[--use-version <value>] [--repo-mirror <value>] [--versions] [--use-http]
[--attach-only] [-c <value>] [-f <value>] [--height <value>] [--width
<value>] [-v] [--tunnel-server <value>] [--agent-id <value>] [--stat-url
<value>] [pid]
Bootstrap Arthas
EXAMPLES:
java -jar arthas-boot.jar <pid>
java -jar arthas-boot.jar --target-ip 0.0.0.0
java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
--agent-id bvDOe8XbTM2pQWjF4cfw
java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
java -jar arthas-boot.jar -c 'sysprop; thread' <pid>
java -jar arthas-boot.jar -f batch.as <pid>
java -jar arthas-boot.jar --use-version 3.1.4
java -jar arthas-boot.jar --versions
java -jar arthas-boot.jar --session-timeout 3600
java -jar arthas-boot.jar --attach-only
java -jar arthas-boot.jar --repo-mirror aliyun --use-http
WIKI:
https://alibaba.github.io/arthas
Options and Arguments:
-h,--help Print usage
--target-ip <value> The target jvm listen ip, default 127.0.0.1
--telnet-port <value> The target jvm listen telnet port, default 3658
--http-port <value> The target jvm listen http port, default 8563
--session-timeout <value> The session timeout seconds, default 1800
(30min)
--arthas-home <value> The arthas home
--use-version <value> Use special version arthas
--repo-mirror <value> Use special maven repository mirror, value is
center/aliyun or http repo url.
--versions List local and remote arthas versions
--use-http Enforce use http to download, default use https
--attach-only Attach target process only, do not connect
-c,--command <value> Command to execute, multiple commands separated
by ;
-f,--batch-file <value> The batch file to execute
--height <value> arthas-client terminal height
--width <value> arthas-client terminal width
-v,--verbose Verbose, print debug info.
--tunnel-server <value> The tunnel server url
--agent-id <value> The agent id register to tunnel server
--stat-url <value> The report stat url
<pid> Target pid
Start up
Before starting arthas, first start a springboot application. The demo address is github.com/yangtao...
java -jar ytao-springboot-demo.jar
Start arthas-boot.jar with the command:
java -jar arthas-boot.jar
Note that the demo and arthas must be started by the same user; otherwise the attach mechanism cannot obtain the process information (I did not pay attention to this when I first used it and ran into this problem). For example, if the root user starts the demo and the user u1 starts arthas, the following message is printed:
Can not find java process. Try to pass <pid> in command line.
Looking at the source code: after the process lookup, a log message is printed. If the lookup result is empty, -1 is returned, and when the result is checked to be less than 0, arthas exits directly.
Startup class: Bootstrap#main code
Process utility: ProcessUtils#select code
From the above analysis, we must start the target process before we start arthas; otherwise arthas may not start.
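The "empty result returns -1, less than 0 exits" behaviour described above can be sketched roughly as follows. This is an illustration only, not the Arthas source: the class SelectSketch, its method signatures, and the map of visible processes are hypothetical, and the sketch returns an exit code rather than terminating the JVM.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the process-selection step; not the Arthas source.
final class SelectSketch {
    /** Returns the first pid visible to the current user, or -1 when none is visible. */
    static long select(Map<Long, String> visibleProcesses) {
        if (visibleProcesses.isEmpty()) {
            System.out.println("Can not find java process. Try to pass <pid> in command line.");
            return -1;
        }
        // For brevity, pick the first entry instead of prompting the user.
        return visibleProcesses.keySet().iterator().next();
    }

    /** Mirrors the "less than 0 means exit" check; returns an exit code instead of exiting. */
    static int boot(Map<Long, String> visibleProcesses) {
        long pid = select(visibleProcesses);
        if (pid < 0) {
            return 1; // a real launcher would stop here
        }
        System.out.println("Try to attach process " + pid);
        return 0;
    }

    public static void main(String[] args) {
        Map<Long, String> seenBySameUser = new LinkedHashMap<>();
        seenBySameUser.put(22005L, "ytao-springboot-demo.jar");
        System.out.println(boot(seenBySameUser));        // target process is visible
        System.out.println(boot(new LinkedHashMap<>())); // nothing visible, e.g. a different user
    }
}
```

An empty map here plays the role of both failure cases from the text: the target was started by a different user, or it was not started yet.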
Screen after a successful start as the root user
Select the java process. Here ytao-springboot-demo is number 1; after the selection, the connection information is printed:
[INFO] arthas home:/root/.arthas/lib/3.1.4/arthas
[INFO] Try to attach process 22005
[INFO] Attach process 22005 success.
[INFO] arthas-client connect 127.0.0.1 3658
  ,---.  ,------. ,--------.,--.  ,--.  ,---.   ,---.
 /  O  \ |  .--. ''--.  .--'|  '--'  | /  O  \ '   .-'
|  .-.  ||  '--'.'   |  |   |  .--.  ||  .-.  |`.  `-.
|  | |  ||  |\  \    |  |   |  |  |  ||  | |  |.-'    |
`--' `--'`--' '--'   `--'   `--'  `--'`--' `--'`-----'
wiki https://alibaba.github.io/arthas
tutorials https://alibaba.github.io/arthas/arthas-tutorials
version 3.1.4
pid 17339
time 2019-10-17 02:29:06
dashboard data panel
Use the dashboard command to view thread, memory, GC, and Runtime information
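For a feel of what these four categories contain, the same kind of data can be read from inside a JVM through the standard java.lang.management MXBeans. The sketch below is only an analogy for the dashboard panel, not a description of how Arthas collects it; the class name DashboardSketch and the output layout are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.RuntimeMXBean;

// Illustrative only: reads thread, memory, GC, and runtime data of the current JVM.
final class DashboardSketch {
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    static String summary() {
        StringBuilder sb = new StringBuilder();
        sb.append("threads=").append(liveThreads()).append('\n');
        sb.append("heapUsedBytes=").append(heapUsedBytes()).append('\n');
        // One line per collector; names depend on the JVM and its GC settings.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc.").append(gc.getName())
              .append(".count=").append(gc.getCollectionCount()).append('\n');
        }
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        sb.append("uptimeMs=").append(runtime.getUptime()).append('\n');
        sb.append("vm=").append(runtime.getVmName());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

The difference in practice is that dashboard shows this for an already running target process, with no code changes or restart.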
jad decompilation
Sometimes the code running online does not produce the result we expect, possibly because the deployed code is not the version we intended. Checking this normally means downloading the package and decompiling it. With arthas, the jad command decompiles a class online in real time, so we can confirm whether the running code matches our version.
jad com.ytao.service.UserServiceImpl
watch function execution information
Use the watch command to view the execution information of a function.
watch parameter list (from the official website)
Parameter | Description |
---|---|
class-pattern | Class name expression matching |
method-pattern | Method name expression matching |
express | Observe the expression |
condition-express | Conditional expression |
[b] | Observe before the method call |
[e] | Observe after the method throws an exception |
[s] | Observe after the method returns |
[f] | Observe after the method ends (normal return and abnormal return) |
[E] | Turn on regular expression matching, the default is wildcard matching |
[x:] | Specify the attribute traversal depth of the output result, the default is 1 |
When we hit a data bug online, we usually simulate the online data in the development environment, look for clues in the production log, or debug remotely. All of these approaches are relatively troublesome. Here the Arthas watch command lets us view code execution in real time. With an observation expression we can view a function's input parameters, return value, and exception information. Observation expressions are essentially OGNL expressions, so you can write any OGNL expression and have it evaluated.
Variables available in observation expressions
variable | Variable description |
---|---|
params | Input parameters of the function |
returnObj | The return value of the function |
throwExp | Exception information |
target | Current object |
View the input parameters and return value of a function
watch com.ytao.service.UserServiceImpl getUser "{params,returnObj}"
In the printed output, isEmpty=false;size=1 shows that the parameter list is not empty and contains one parameter. To view the specific input parameter:
watch com.ytao.service.UserServiceImpl getUser "{params[0],returnObj}"
View exception information
watch com.ytao.service.UserServiceImpl getUser "throwExp"
When we pass the parameter -1, the illegal-parameter exception we defined is printed.
In addition to observation expressions, watch also supports conditional expressions and observation event points.
Note that when an observation event point is used, some variables of the observation expression may not exist. For example, with -b the return value and exception information are both empty.
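Which variables are filled in at each event point can be illustrated with a small simulation. Everything here is hypothetical: EventPointSketch, its getUser stand-in (the demo's real UserServiceImpl#getUser is not shown in this article), and the output format are assumptions made only to show why returnObj and throwExp are empty at -b.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Illustrative simulation of the watch event points; not how Arthas is implemented.
final class EventPointSketch {
    /** Hypothetical stand-in for the demo's getUser: rejects negative ids. */
    static Object getUser(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("illegal parameter: " + id);
        }
        return "user-" + id;
    }

    /** Runs fn once and returns one line per event point, in the order the points fire. */
    static List<String> observe(IntFunction<Object> fn, int param) {
        List<String> lines = new ArrayList<>();
        lines.add(line("b", param, null, null)); // -b: nothing has been produced yet
        Object returnObj = null;
        Throwable throwExp = null;
        try {
            returnObj = fn.apply(param);
            lines.add(line("s", param, returnObj, null)); // -s: normal return only
        } catch (RuntimeException e) {
            throwExp = e;
            lines.add(line("e", param, null, throwExp)); // -e: exception only
        }
        lines.add(line("f", param, returnObj, throwExp)); // -f: fires in both cases
        return lines;
    }

    private static String line(String point, int param, Object returnObj, Throwable throwExp) {
        return "-" + point + " params=[" + param + "] returnObj=" + returnObj + " throwExp=" + throwExp;
    }

    public static void main(String[] args) {
        for (String l : observe(EventPointSketch::getUser, 1)) System.out.println(l);
        for (String l : observe(EventPointSketch::getUser, -1)) System.out.println(l);
    }
}
```

In both runs the -b line shows only params, which is the situation the note above warns about.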
Sometimes when we troubleshoot a function, we cannot capture its information right away. The background asynchronous jobs provided by arthas can record the output to a log for us. The usage is similar to redirection and background jobs in Linux.
watch com.ytao.service.UserServiceImpl getUser "{params,returnObj}" >/log/w.log &
View asynchronously saved logs
tt locate abnormal call
The watch command described above can be used to inspect calls, but it is best suited to cases where we already roughly know which call we want to look at. If a function is called n times and only a few of those calls throw exceptions, finding the abnormal calls with watch is not very convenient. The tt command makes it easier to find abnormal calls and their details. For the com.ytao.service.UserServiceImpl#getUser function, the -t option records every call of the function:
tt -t com.ytao.service.UserServiceImpl getUser
record information
View all records
tt -l
View the specified function record
tt -s 'method.name=="getUser"'
Output information description
Field | Description |
---|---|
INDEX | Time fragment record number. Each number represents one call, and many subsequent tt commands operate on a record specified by this number, so it is very important. |
TIMESTAMP | The local time at which the method was executed, that is, when this time fragment occurred |
COST(ms) | Method execution time |
IS-RET | Whether the method ended with a normal return |
IS-EXP | Whether the method ended by throwing an exception |
OBJECT | The hashCode() of the executing object. Note that it is sometimes mistaken for the object's memory address in the JVM, which it is not. It can still help you roughly identify the instance that executed the method |
CLASS | Name of the executed class |
METHOD | Name of the executed method |
From the fields above, we can see that call 1003 ended by throwing an exception. Because tt records the information of every call, we can view the details of call 1003:
tt -i 1003
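The idea of recording every call and then picking out the ones that ended with an exception can be sketched as below. This is a toy model under stated assumptions, not the tt implementation: TimeTunnelSketch and Fragment are hypothetical names, the starting index of 1000 is chosen only so the numbers resemble the 1003 seen above, and the recorder swallows the exception purely to keep recording.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Toy recorder loosely modelled on the tt output columns; not the Arthas implementation.
final class TimeTunnelSketch {
    static final class Fragment {
        final int index;
        final long timestampMillis;
        final double costMs;
        final boolean isExp;
        final String method;

        Fragment(int index, long timestampMillis, double costMs, boolean isExp, String method) {
            this.index = index;
            this.timestampMillis = timestampMillis;
            this.costMs = costMs;
            this.isExp = isExp;
            this.method = method;
        }

        boolean isRet() {
            return !isExp;
        }
    }

    private final List<Fragment> fragments = new ArrayList<>();
    private int nextIndex = 1000; // assumed starting index, for illustration

    /** Invokes fn and stores one fragment for the call, whether it returns or throws. */
    void record(String method, IntFunction<Object> fn, int param) {
        long start = System.nanoTime();
        boolean threw = false;
        try {
            fn.apply(param);
        } catch (RuntimeException e) {
            threw = true; // swallowed only so the sketch can keep recording
        }
        double costMs = (System.nanoTime() - start) / 1_000_000.0;
        fragments.add(new Fragment(nextIndex++, System.currentTimeMillis(), costMs, threw, method));
    }

    /** Indexes of the recorded calls that ended by throwing (IS-EXP is true). */
    List<Integer> indexesEndedByException() {
        List<Integer> result = new ArrayList<>();
        for (Fragment f : fragments) {
            if (f.isExp) {
                result.add(f.index);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TimeTunnelSketch tt = new TimeTunnelSketch();
        // Hypothetical getUser: rejects negative ids.
        IntFunction<Object> getUser = id -> {
            if (id < 0) throw new IllegalArgumentException("illegal parameter: " + id);
            return "user-" + id;
        };
        for (int id : new int[] {1, 2, 3, -1}) {
            tt.record("getUser", getUser, id);
        }
        System.out.println(tt.indexesEndedByException());
    }
}
```

Once the failing index is known, looking up that single record is the equivalent of tt -i with that index.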
trace view call chain
We often find that the response time (rt) of an api call is too long, and we have to find the one or several functions in the call chain that need optimizing. Usually we pick several likely anchor points and print the rt between them, or find the log timestamps and calculate the time differences. Either way, it is cumbersome. With the arthas trace command, this becomes easy.
trace parameter description
parameter | Parameter Description |
---|---|
class-pattern | Class name expression matching |
method-pattern | Method name expression matching |
condition-express | Conditional expression |
[E] | Turn on regular expression matching, the default is wildcard matching |
[n:] | Command execution times |
#cost | Method execution time in ms |
Use trace to output the call information of com.ytao.controller.UserController#getUser
trace com.ytao.service.UserServiceImpl getUser
Output result
In real troubleshooting, to reduce useless output we generally filter by elapsed time with #cost; the JDK's own functions can also be ignored to reduce the output. For example, to filter out calls that take less than 1ms:
trace com.ytao.service.UserServiceImpl getUser '#cost > 1'
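The effect of a condition such as '#cost > 1' can be pictured as a plain filter over per-call timings in milliseconds. The sketch below is only that picture: CostFilterSketch, slowCalls, and timeMs are hypothetical helpers, and the sample timings are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative filter over call timings; not how Arthas evaluates its condition expression.
final class CostFilterSketch {
    /** Returns the positions of the calls whose cost in ms is strictly above the threshold. */
    static List<Integer> slowCalls(double[] costsMs, double thresholdMs) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < costsMs.length; i++) {
            if (costsMs[i] > thresholdMs) {
                kept.add(i);
            }
        }
        return kept;
    }

    /** Measures one call in milliseconds, the unit the '#cost > 1' example relies on. */
    static double timeMs(Runnable call) {
        long start = System.nanoTime();
        call.run();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        double[] costsMs = {0.3, 1.0, 4.2}; // invented sample timings
        System.out.println(slowCalls(costsMs, 1.0)); // prints [2]
    }
}
```

Because the comparison is strict, a call that takes exactly 1 ms is filtered out along with the faster ones.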
redefine implements hot deployment
When we find a bug and want a fix online quickly, Arthas provides the redefine command for hot updates. Although the one-stop jad/mc/redefine hot-update workflow is now promoted, for production code it is recommended to compile locally and then replace the class, to avoid mistakes. First, add a line of code in UserServiceImpl
Get the classLoaderHash by viewing the class information with the sc command
sc -d *UserServiceImpl
Use redefine to apply the modified class
redefine -c 1d56ce6a /usr/local/jar/UserServiceImpl.class
Verify through the printed information whether the UserServiceImpl class has been updated
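The value passed to -c identifies the class loader that loaded the class. As a rough illustration of where such a short hex string could come from, the sketch below prints the hex form of a loader's hashCode(). That derivation is an assumption made for illustration and is not confirmed by this article; LoaderHashSketch is a hypothetical name.

```java
// Illustrative only: assumes a loader hash is the hex form of ClassLoader#hashCode().
final class LoaderHashSketch {
    static String loaderHash(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        // Classes loaded by the bootstrap loader report a null ClassLoader.
        return loader == null ? "null" : Integer.toHexString(loader.hashCode());
    }

    public static void main(String[] args) {
        System.out.println(loaderHash(LoaderHashSketch.class)); // application class
        System.out.println(loaderHash(String.class));           // bootstrap class
    }
}
```

The value differs between JVM runs, which is why it has to be looked up with sc -d on the running process before each redefine.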
Besides the usage above, Arthas has other diagnostic functions; the ones covered here are just those I personally use. To get the most out of this kind of tool, you need to combine its commands and apply a troubleshooting method suited to each problem you run into, rather than using them blindly.
Personal blog: ytao.top
My official account ytao