
Can't aggregate babar.log to yarn logs #17

Open
andres-lago opened this issue Sep 3, 2018 · 2 comments
@andres-lago

Hello,
I can't get the babar.log aggregated to yarn logs. When I run the command to get the logs:
yarn logs --applicationId application_XXXX_YYYY > myAppLog.log

the resulting myAppLog.log doesn't contain the babar.log traces. It contains only my application's and YARN's log messages.

It is not a problem when I launch the application from the Linux command line (calling spark2-submit directly), because the file babar.log is created in the same directory. But when I launch it from an Oozie workflow (production environment), the file babar.log disappears when the container terminates and its content is not aggregated.

I realized that the system properties yarn.app.container.log.dir and spark.yarn.app.container.log.dir are null, so babar falls back to a local directory, ./log, to store the log. Could that be the reason? Has anyone else observed the same problem?
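To illustrate, here is a minimal sketch of the fallback behavior described above. This is my own reconstruction, not babar's actual code; the class and method names are hypothetical, only the property names and the ./log default come from the observation above.

```java
// Hypothetical sketch of the fallback described above: if neither
// yarn.app.container.log.dir nor spark.yarn.app.container.log.dir
// is set as a system property, fall back to a local "./log" directory.
public class LogDirResolver {
    static String resolveLogDir() {
        String dir = System.getProperty("yarn.app.container.log.dir");
        if (dir == null) {
            dir = System.getProperty("spark.yarn.app.container.log.dir");
        }
        return dir != null ? dir : "./log";
    }

    public static void main(String[] args) {
        System.out.println(resolveLogDir());
    }
}
```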

@BenoitHanotte
Contributor

Hello!
Yes, that could be the problem: the agent uses the environment properties to find where to store the log file. If the properties are not found, it stores the log in a local folder named log.
If you try replacing the environment property lookup by reading the value from an initialized Hadoop configuration, do you get a correctly set value?
Something like:

Configuration conf = new Configuration();
String logDir = conf.get("yarn.app.container.log.dir");

You could also try to specify a custom log directory via the agent parameters that you add to your Java options.
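As a rough sketch of that last suggestion, one way to hand the driver JVM an explicit directory is through Spark's extra Java options. Note this is an assumption, not a confirmed babar setting: the /tmp/babar path is just an example location, and whether setting this system property is picked up by the agent should be checked against babar's README.

```shell
# Sketch: set the log-dir system property on the driver JVM via
# spark2-submit. Assumes the agent reads it as a system property;
# /tmp/babar is an example path, not a recommended location.
spark2-submit \
  --conf "spark.driver.extraJavaOptions=-Dyarn.app.container.log.dir=/tmp/babar" \
  --class com.example.MyApp myapp.jar
```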

@andres-lago
Author

Hi Benoit,
thanks for your support. We've chosen the easiest solution: writing babar.log to a local directory outside the container (/tmp) and collecting it manually afterwards from the server concerned (the driver's server in our case). I couldn't find an easy way to get the property value (the directory of the YARN containers' logs) and pass it to babar before launching spark2-submit from an Oozie workflow.

We activated babar only for the driver; we didn't manage to launch it in the executors. But as we're currently working on a driver problem, that's enough for now.
