Use Natural Language Understanding to analyze various features of text content at scale. Provide text, raw HTML, or a public URL, and IBM Watson Natural Language Understanding will give you results for the features you request. The service cleans HTML content before analysis by default, so the results can ignore most advertisements and other unwanted content.
import com.ibm.cloud.sdk.core.security.Authenticator;
import com.ibm.cloud.sdk.core.security.IamAuthenticator;
import com.ibm.watson.natural_language_understanding.v1.NaturalLanguageUnderstanding;
import com.ibm.watson.natural_language_understanding.v1.model.*;
Authenticator authenticator = new IamAuthenticator("<iam_api_key>");
NaturalLanguageUnderstanding service = new NaturalLanguageUnderstanding("2019-07-12", authenticator);
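EntitiesOptions entities = new EntitiesOptions.Builder().build();
Features features = new Features.Builder().entities(entities).build();
// "<text_to_analyze>" is a placeholder; supply your own content with .text(...), .html(...), or .url(...).
AnalyzeOptions parameters = new AnalyzeOptions.Builder().text("<text_to_analyze>").features(features).build();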
AnalysisResults results = service.analyze(parameters).execute().getResult();
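The description above names three kinds of input (text, raw HTML, or a public URL) and a set of requested features. As a rough illustration of how those pieces might fit together in a request, the following standard-library sketch builds a JSON body with exactly one input field and an entities feature. It assumes the analyze request accepts one of `text`, `html`, or `url` alongside a `features` object. The class `AnalyzeRequestSketch` and its helpers `quote` and `body` are hypothetical names used only for this example and are not part of the Watson SDK.

```java
import java.util.Set;

public class AnalyzeRequestSketch {
    // The three input kinds named in the description (assumed JSON field names).
    private static final Set<String> INPUT_KINDS = Set.of("text", "html", "url");

    // Minimal JSON string escaping: double quote, backslash, and control characters.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Builds a body with exactly one input field and an entities feature with default options.
    static String body(String inputKind, String value) {
        if (!INPUT_KINDS.contains(inputKind)) {
            throw new IllegalArgumentException("input kind must be text, html, or url");
        }
        return "{" + quote(inputKind) + ":" + quote(value)
                + ",\"features\":{\"entities\":{}}}";
    }

    public static void main(String[] args) {
        // Hypothetical input: any public URL could be used here.
        System.out.println(body("url", "https://www.example.com"));
    }
}
```

When you use the Java SDK, `AnalyzeOptions.Builder` assembles this payload for you, so a hand-built body like this is mainly useful for seeing what the builder calls correspond to.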