-1

I am new to shell scripting. My folder structure is shown below; every folder contains one file named note.json. I want to extract a specific field, such as "user", from each note.json. I tried this for a single file and it works, but it also prints unnecessary data. I also need it in a loop (that is, going into every folder and doing the same thing). Can anybody help me out?

my folder structure:

drwxr-xr-x   - zeppelin hdfs          0 2020-06-01 16:20 /user/zeppelin/notebook/2FBC2M3K2
drwxr-xr-x   - zeppelin hdfs          0 2020-05-20 18:01 /user/zeppelin/notebook/2FBDEKUGP
drwxr-xr-x   - zeppelin hdfs          0 2020-05-26 20:32 /user/zeppelin/notebook/2FBDXNZRC
drwxr-xr-x   - zeppelin hdfs          0 2020-05-26 21:00 /user/zeppelin/notebook/2FBEAGZEE
drwxr-xr-x   - zeppelin hdfs          0 2020-05-25 14:18 /user/zeppelin/notebook/2FBGXSHZR
drwxr-xr-x   - zeppelin hdfs          0 2020-05-20 14:31 /user/zeppelin/notebook/2FBHCNKJP
drwxr-xr-x   - zeppelin hdfs          0 2020-06-02 17:34 /user/zeppelin/notebook/2FBJCZ212

I tried this for a single folder using the command below:

$ cat note.json | grep "user"
"user": "Ayan.Paul",
            "data": "org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:300)\n\tat org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:286)\n\tat org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:324)\n\tat org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:265)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:718)\n\tat org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:801)\n\tat org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)\n\tat org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:633)\n\tat org.apache.zeppelin.scheduler.Job.run(Job.java:188)\n\tat org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [Ayan.Paul] 
does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)\n\tat org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)\n\tat org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)\n\tat org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541)\n\tat org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527)\n\tat org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)\n\tat org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)\n\tat org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)\n\tat org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat org.apache.thrift.server.TServlet.doPost(TServlet.java:83)\n\tat org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:208)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:707)\n\tat javax.servlet.http.HttpServlet.service(HttpServlet.java:790)\n\tat org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848)\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:493)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\n\tat org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\t... 
3 more\nCaused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAccessControlException:Permission denied: user [Ayan.Paul] does not have [USE] privilege on [snt_mmedata_upload_prd]\n\tat org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.checkPrivileges(RangerHiveAuthorizer.java:483)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorizationV2(Driver.java:1330)\n\tat org.apache.hadoop.hive.ql.Driver.doAuthorization(Driver.java:1094)\n\tat org.apache.hadoop.hive.ql.Driver.compile(Driver.java:705)\n\tat org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810)\n\tat org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805)\n\tat org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)\n\
wjandrea
  • Hi guys, I need only user:ayan.paul in my file; the remaining unnecessary data should be removed. The same process should apply to the remaining folders as well. – Vamshi Krishna Jun 02 '20 at 13:40
  • Does this answer your question? [Parsing JSON with Unix tools](https://stackoverflow.com/questions/1955505/parsing-json-with-unix-tools) and [Change directory using loop in linux](https://stackoverflow.com/q/46973632/4518341) – wjandrea Jun 02 '20 at 14:06
  • Although you might not need to cd in a loop. Try using a glob: `/user/zeppelin/notebook/*/note.json` – wjandrea Jun 02 '20 at 14:10

2 Answers

1

As said above, if the file is structured JSON, the best and cleanest way is to use jq. Otherwise, if this line always stays the same, you can try:

cat note.json | grep "\"user\":" | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g'

where

grep "\"user\":" - will take the line you wanted

cut -d":" -f2 - optional and not used in the command above: add it after grep to keep only the second ":"-separated field (the value, without the user: prefix)

sed 's/\"//g' - remove "

sed 's/,//g' - remove commas

sed 's/ //g' - will remove spaces just in case ( you don't have to use it)
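
The steps above can also be collapsed into a single sed call. This is only a minimal sketch: it assumes the line looks like "user": "Ayan.Paul", as in the question, and the function name extract_user is made up for illustration.

```shell
#!/usr/bin/env bash
# extract_user is a hypothetical helper name, not part of any tool.
# Assumes each wanted line looks like:   "user": "Ayan.Paul",
# Prints user:<name> for every such line in the file given as $1.
extract_user() {
    # -n plus the trailing p prints only lines where the substitution matched,
    # so the long "data" lines that merely mention a user are skipped.
    sed -n 's/^[[:space:]]*"user":[[:space:]]*"\([^"]*\)".*$/user:\1/p' "$1"
}
```

Calling extract_user note.json would then print only the cleaned line(s), without the stack trace text.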

If you need a loop for it, let's say:

folder_Path='/path/to/myfolder'
new_file_path='/path/to/users.txt'   # hypothetical output file, adjust as needed

for file in "${folder_Path}"/*/note.json   # one note.json per subfolder
do
    if [[ -f ${file} ]]
    then
        # >> appends, so results from earlier folders are kept
        grep "\"user\":" "${file}" | sed 's/\"//g' | sed 's/,//g' | sed 's/ //g' >> "${new_file_path}"
    fi
done
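
Since every notebook folder holds its own note.json, it may help to record which folder each user came from. The following is a rough sketch under a few assumptions: the files are on the local filesystem, the folder names contain no "|" character, and collect_users, base_dir and out_file are names invented for this example.

```shell
#!/usr/bin/env bash
# collect_users is a hypothetical helper; base_dir and out_file are placeholders.
# Writes one "<folder> user:<name>" line per distinct user in each note.json.
collect_users() {
    local base_dir=$1 out_file=$2 file folder
    : > "$out_file"                          # start with an empty output file
    for file in "$base_dir"/*/note.json; do
        [ -f "$file" ] || continue           # skip if the glob matched nothing
        folder=$(basename "$(dirname "$file")")
        # keep the "user": lines, strip quotes/commas/spaces, drop duplicates,
        # then prefix each result with the folder name
        grep '"user":' "$file" | tr -d '", ' | sort -u |
            sed "s|^|$folder |" >> "$out_file"
    done
}
```

For example, collect_users /user/zeppelin/notebook users.txt would produce lines such as 2FBC2M3K2 user:Ayan.Paul. The listing in the question resembles hdfs dfs -ls output; if the notebooks live in HDFS rather than on local disk, they would first have to be copied locally, for instance with hdfs dfs -get.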
Dolevn
0

If you know that the note.json file always has "user" at the beginning of a line, then you can grep for that. It also sounds like you want the value of the "user" JSON field; try using jq to parse that. Below is the "cheap and dirty" way of stripping out the extra characters. (We'll stick with a loop because you're probably doing other things for each file...)

for file in $(find . -name note.json); do
    grep "^.user" "$file" | cut -c 10- | tr -d '",'
done
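
The loop above splits the find output on whitespace, which is fine for folder names like 2FBC2M3K2 but would break on paths containing spaces. A possible variant, sketched here with a made-up function name (print_users_under), reads NUL-separated paths instead:

```shell
#!/usr/bin/env bash
# print_users_under is a hypothetical helper name.
# Prints the bare user value from every note.json below the directory $1.
print_users_under() {
    # -print0 and read -d '' keep paths with spaces or newlines intact
    find "$1" -type f -name note.json -print0 |
        while IFS= read -r -d '' file; do
            # same extraction as above: the line must start with "user": "
            grep '^.user' "$file" | cut -c 10- | tr -d '",'
        done
}
```

Running print_users_under . from the notebook directory should give the same output as the original loop.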

If you want help with using jq to parse JSON, just ask a different question showing a "note.json" file and your attempt at parsing it!
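
For reference, a jq-based version could look roughly like the sketch below. The exact layout of note.json is not shown in the question, so this assumes only that "user" appears as a string-valued key somewhere in the document, possibly nested and possibly repeated; list_note_users is a name invented here, and jq must be installed.

```shell
#!/usr/bin/env bash
# list_note_users is a hypothetical helper name. Requires jq.
# Assumption: "user" is a string-valued key somewhere in the JSON tree.
list_note_users() {
    # ..       visits every value in the document, at any depth
    # .user?   takes the "user" field of objects and ignores non-objects
    # strings  keeps only string results (drops null and other types)
    jq -r '.. | .user? | strings' "$1" | sort -u
}
```

Because jq parses the JSON properly, the stack trace text inside "data" is never matched, whatever line breaks or indentation the file uses.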

Eric Bolinger