
I'm working with the Twitter API and I'm trying to get only the first line of a tweet's text and its link. If I console.log tweetNews, I get something like this, for example:

 {
    text: 'exampletext. \n' +
      '\n' +
      'exampletext. \n' +
      '\n' +
      'https://examplelink',
    href: 'https://examplelink'
  },

I would like to get only the first string of the text, and also remove the link from inside the text. Here is the rest of my code:

module.exports.filterTweets = (parsedArray) => {
    
    const filteredTweets = parsedArray.filter(
        (tweet) => tweet.entities.urls.length === 1
    );
    
    
    const tweetNews = filteredTweets.map((x) => {
        return { text: x.full_text, href: x.entities.urls[0].url };
    });
    
    console.log("tweetNews", tweetNews);
    return tweetNews;
};
marbobby13
  • By "get only the first string of the `text`" I think you mean you want to split on the newline character? – Michael Rush Mar 09 '21 at 14:59
  • `text` is one string, it's just that it's been split into parts at the string concatenation operator (`+`). Do you mean you want the part up to the first `\n`? – Andrew Morton Mar 09 '21 at 14:59
  • Yes, exactly: I want only the first part, up to the first `\n`, and I want to remove anything that contains `https` from my `text` – marbobby13 Mar 09 '21 at 15:01

1 Answer


I'm not sure I 100% follow what you are asking for, but this should get you on your way. This shows how to get the first part of the string (up to the first line break), and how to remove (http/https) URLs:

function cleanInput(input) {
 // keep everything up to the first line break, whether or not spaces surround it
 let firstLine = input.split(/\r?\n/)[0];
 // this regex is overly simplistic, just for demo purposes
 // see more complete example at https://stackoverflow.com/a/3809435/57624
 let withoutURL = firstLine.replace(/https?:\/\/\S+/gi, '');
 // drop leftover whitespace around the cleaned text
 return withoutURL.trim();
}

let sampleData1 = {
    text: 'exampletext. \n' +
      '\n' +
      'exampletext. \n' +
      '\n' +
      'https://examplelink',
    href: 'https://examplelink'
    };
  
let sampleResult1 = cleanInput(sampleData1.text)
console.log("result 1: " + sampleResult1)

let sampleData2 = {
    text: 'A URL: https://examplelink.com \n' +
      '\n' +
      'exampletext. \n' +
      '\n' +
      'more',
    href: 'https://examplelink'
    };
  
let sampleResult2 = cleanInput(sampleData2.text)
console.log("result 2: " + sampleResult2)
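
To tie this back to the question, here is a minimal sketch of how the same idea could be applied inside the `map` step of `filterTweets`. It assumes each tweet object has the `full_text` and `entities.urls` fields shown in the question. The helper name `firstLineWithoutUrls` and the sample tweets are made up for illustration, and the URL pattern is still a simplification rather than a complete URL matcher:

```javascript
// Hypothetical helper: keep only the first line of a tweet and drop any URLs in it.
function firstLineWithoutUrls(fullText) {
    // Everything before the first line break (handles \n and \r\n)
    const firstLine = fullText.split(/\r?\n/)[0];
    // Simplistic URL pattern: "http://" or "https://" followed by non-whitespace
    return firstLine.replace(/https?:\/\/\S+/gi, "").trim();
}

// Same filter/map shape as filterTweets in the question
const filterTweets = (parsedArray) =>
    parsedArray
        .filter((tweet) => tweet.entities.urls.length === 1)
        .map((tweet) => ({
            text: firstLineWithoutUrls(tweet.full_text),
            href: tweet.entities.urls[0].url,
        }));

// Made-up sample data with only the fields used above
const sampleTweets = [
    {
        full_text: "exampletext. \n\nexampletext. \n\nhttps://examplelink",
        entities: { urls: [{ url: "https://examplelink" }] },
    },
    {
        full_text: "no links here",
        entities: { urls: [] },
    },
];

const tweetNews = filterTweets(sampleTweets);
console.log("tweetNews", tweetNews);
```

With this sample, only the first tweet passes the filter, and its `text` is reduced to the part before the first line break.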
Michael Rush
  • I'm running your code and it works, but inside my project it doesn't. The object is a tweet from the Twitter API: I'm trying to show the news on my local website, and I think the tweets come in groups of 15 each time. I don't know if this detail makes a difference. So when I return `text: x.full_text` in my function, `x.full_text` needs to be, as I said before, just the first line of the string, without anything after `https`. – marbobby13 Mar 09 '21 at 15:33
  • `const tweetNews = filteredTweets.map((x) => { const text = x.full_text; const link = text.indexOf(x.entities.urls[0].url); return { text: text.slice(0, link - 1), href: x.entities.urls[0].url, }; }); ` At the moment for example I managed to remove that link inside `text` with the code above. But it's still returning a way too long text. I don't know if this makes more sense. – marbobby13 Mar 09 '21 at 16:00
  • "still returning a way too long text" -- sounds like maybe you want to truncate to a limited length? Maybe you can provide an example of `x.full_text` (obfuscating the content if necessary) and then show how you'd like the result to look? – Michael Rush Mar 09 '21 at 16:24
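
If truncating to a limited length turns out to be what is wanted, one possible approach is sketched below. The helper name `truncateText`, the default limit of 100 characters, and the trailing `...` are all assumptions for illustration; nothing in the thread specifies a length or a format:

```javascript
// Hypothetical helper: shorten text to roughly maxLength characters,
// breaking at the last space so a word is not cut in half,
// then appending "..." to show that text was removed.
function truncateText(text, maxLength = 100) {
    if (text.length <= maxLength) {
        return text; // short enough already, return unchanged
    }
    const cut = text.slice(0, maxLength);
    const lastSpace = cut.lastIndexOf(" ");
    // Fall back to a hard cut if there is no space to break on
    const base = lastSpace > 0 ? cut.slice(0, lastSpace) : cut;
    return base.trimEnd() + "...";
}

console.log(truncateText("short text", 20));
console.log(truncateText("a fairly long first line of a tweet", 15));
```

Note that the returned string can be up to three characters longer than `maxLength` because of the appended `...`.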